2023 | OriginalPaper | Chapter

Distributional Correlation–Aware Knowledge Distillation for Stock Trading Volume Prediction

Authors: Lei Li, Zhiyuan Zhang, Ruihan Bao, Keiko Harimoto, Xu Sun

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer Nature Switzerland

Abstract

Traditional knowledge distillation for classification transfers knowledge via the class correlations encoded in the soft labels produced by teacher models, which are not available in regression problems such as stock trading volume prediction. To remedy this, we present a novel distillation framework for training a lightweight student model to predict trading volume from historical transaction data. Specifically, we turn the regression model into a probabilistic forecasting model by training it to predict a Gaussian distribution that the trading volume follows. The student model can thus learn from the teacher at a more informative distributional level, by matching its predicted distributions to those of the teacher. Two correlational distillation objectives are further introduced to encourage the student to produce pair-wise relationships consistent with those of the teacher model. We evaluate the framework on a real-world stock volume dataset with two different time-window settings. Experiments demonstrate that our framework is superior to strong baseline models, compressing the model size by \(5\times\) while maintaining \(99.6\%\) prediction accuracy. Extensive analysis further reveals that our framework is more effective than vanilla distillation methods under low-resource scenarios. Our code and data are available at https://github.com/lancopku/DCKD.
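To make the idea concrete, below is a minimal sketch, assuming a PyTorch setup in which both teacher and student output the mean and standard deviation of a Gaussian over the trading volume. The function names, loss weights, and the exact form of the correlational term are hypothetical illustrations rather than the paper's exact objectives: the student is fit with a Gaussian negative log-likelihood on the ground truth, pulled toward the teacher's distribution with a closed-form Gaussian KL term, and encouraged to reproduce the teacher's pair-wise relations within a batch.

```python
import torch
import torch.nn.functional as F


def gaussian_kl(mu_s, sigma_s, mu_t, sigma_t):
    """Closed-form KL( N(mu_t, sigma_t^2) || N(mu_s, sigma_s^2) ).

    Used as a distribution-level distillation signal between the
    teacher's and the student's predicted Gaussians.
    """
    return (torch.log(sigma_s / sigma_t)
            + (sigma_t ** 2 + (mu_t - mu_s) ** 2) / (2 * sigma_s ** 2)
            - 0.5)


def pairwise_relation_loss(mu_s, mu_t):
    """A simple correlational term (hypothetical form): match the matrices
    of pair-wise differences between predictions within a batch."""
    d_s = mu_s.unsqueeze(0) - mu_s.unsqueeze(1)   # [B, B]
    d_t = mu_t.unsqueeze(0) - mu_t.unsqueeze(1)   # [B, B]
    return F.mse_loss(d_s, d_t)


def distillation_loss(mu_s, sigma_s, mu_t, sigma_t, target,
                      alpha=0.5, beta=0.1):
    """Combine ground-truth NLL, distributional KD, and correlational KD."""
    # Gaussian negative log-likelihood of the observed volume under the
    # student's predicted distribution (the probabilistic regression loss).
    nll = F.gaussian_nll_loss(mu_s, target, sigma_s ** 2)
    kd = gaussian_kl(mu_s, sigma_s, mu_t, sigma_t).mean()
    corr = pairwise_relation_loss(mu_s, mu_t)
    return nll + alpha * kd + beta * corr
```

Note that the raw pair-wise difference matrix makes this correlational term scale-sensitive; the paper's objectives may instead use normalized or correlation-based statistics, so the weights alpha and beta above would need tuning on a validation split.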


Metadata
Title
Distributional Correlation–Aware Knowledge Distillation for Stock Trading Volume Prediction
Authors
Lei Li
Zhiyuan Zhang
Ruihan Bao
Keiko Harimoto
Xu Sun
Copyright Year
2023
DOI
https://doi.org/10.1007/978-3-031-26422-1_7
