2024 | OriginalPaper | Chapter

MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading

Authors : Xi Cheng, Jinghao Zhang, Yunan Zeng, Wenfang Xue

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore

Abstract

Algorithmic trading refers to executing buy and sell orders for specific assets based on automatically identified trading opportunities. Strategies based on reinforcement learning (RL) have demonstrated remarkable capabilities in addressing algorithmic trading problems. However, trading patterns differ across market conditions because the data distribution shifts, and ignoring these multiple patterns undermines the performance of RL. In this paper, we propose MOT, which designs multiple actors with disentangled representation learning to model the different patterns of the market. Furthermore, we incorporate the Optimal Transport (OT) algorithm to allocate samples to the appropriate actor by introducing a regularization loss term. Additionally, we propose a Pretrain Module that facilitates imitation learning by aligning the actors' outputs with an expert strategy, better balancing the exploration and exploitation of RL. Experimental results on real futures market data demonstrate that MOT exhibits excellent profit capability while balancing risk. Ablation studies validate the effectiveness of the components of MOT.
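
The chapter page does not reproduce code, but the allocation idea in the abstract can be illustrated with a small sketch: samples in a batch are soft-assigned to K actors by entropic optimal transport (Sinkhorn), and a regularization term penalizes a router's weights for drifting away from the resulting plan. All names, shapes, the balanced actor marginal, and the cross-entropy form of the regularizer are illustrative assumptions, not the authors' implementation.

import numpy as np

def sinkhorn(cost, a, b, eps=0.1, n_iters=200):
    """Entropic OT: transport plan with row marginals a and column marginals b."""
    G = np.exp(-cost / eps)            # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (G.T @ u)              # column scaling
        u = a / (G @ v)                # row scaling
    return u[:, None] * G * v[None, :]

rng = np.random.default_rng(0)
N, K = 8, 3                                        # N samples in a batch, K actors
actor_scores = rng.normal(size=(N, K))             # e.g. how well each actor fits each sample
cost = -actor_scores                               # lower cost = better actor for that sample
plan = sinkhorn(cost, np.full(N, 1.0 / N), np.full(K, 1.0 / K))

# One way to turn the plan into a regularization loss: penalize a router's
# softmax weights for deviating from the OT assignment (row-normalized plan).
router_probs = np.exp(actor_scores) / np.exp(actor_scores).sum(axis=1, keepdims=True)
ot_targets = plan / plan.sum(axis=1, keepdims=True)
reg_loss = -(ot_targets * np.log(router_probs + 1e-9)).sum(axis=1).mean()
print(np.round(plan, 3), reg_loss)

The balanced column marginal (1/K per actor) is one plausible choice; it encourages every actor to receive a share of the samples rather than letting a single actor dominate.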

Footnotes
1
Transaction costs are charged as a percentage of the contract.
 
2
Slippage refers to the difference between the expected and the actual execution price.
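
As a concrete (assumed) illustration of footnotes 1 and 2, the sketch below applies a proportional transaction cost and computes slippage for a simulated fill; the fee rate, contract multiplier, and function name are hypothetical and not taken from the paper.

def simulate_fill(expected_price, executed_price, contracts, contract_multiplier,
                  cost_rate=2e-4, side=+1):
    """side=+1 for a buy, -1 for a sell; cost_rate is an illustrative fee percentage."""
    notional = executed_price * contract_multiplier * contracts
    transaction_cost = cost_rate * notional                    # charged as a percentage of the contract value
    # positive slippage = adverse: paid more than expected on a buy, received less on a sell
    slippage = side * (executed_price - expected_price) * contract_multiplier * contracts
    return transaction_cost, slippage

fee, slip = simulate_fill(expected_price=3500.0, executed_price=3501.0,
                          contracts=2, contract_multiplier=10)
print(fee, slip)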
 
3
A well-known Chinese quantitative trading platform, https://www.ricequant.com/.
 
4
We chose it as a baseline because the Pretrain Module employs a GRU before imitation learning; the GRU results therefore indicate the performance of the Pretrain Module.
 
5
We enhance PPO with the imitation learning described in the Methodology section.
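
A minimal sketch of how footnotes 4 and 5 could fit together, assuming a PyTorch implementation: a GRU policy is pretrained to match expert actions (imitation learning), and the PPO clipped objective is then augmented with the same imitation term. The network sizes, the weight beta, and the discrete action space are assumptions, not the paper's exact setup.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GRUPolicy(nn.Module):
    def __init__(self, n_features, n_actions, hidden=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, x):                          # x: (batch, time, n_features)
        out, _ = self.gru(x)
        return self.head(out[:, -1])               # action logits from the last time step

def pretrain_imitation_loss(policy, states, expert_actions):
    """Behavior cloning: align the policy's outputs with the expert strategy."""
    return F.cross_entropy(policy(states), expert_actions)

def ppo_imitation_loss(policy, states, actions, old_logp, advantages,
                       expert_actions, clip=0.2, beta=0.1):
    logits = policy(states)
    logp = torch.distributions.Categorical(logits=logits).log_prob(actions)
    ratio = torch.exp(logp - old_logp)
    clipped = torch.clamp(ratio, 1 - clip, 1 + clip) * advantages
    surrogate = torch.min(ratio * advantages, clipped)          # PPO clipped objective
    imitation = F.cross_entropy(logits, expert_actions)         # same alignment term as pretraining
    return -surrogate.mean() + beta * imitation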
 
Metadata
Title
MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading
Authors
Xi Cheng
Jinghao Zhang
Yunan Zeng
Wenfang Xue
Copyright Year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2238-9_3
