2019 | Original Paper | Book Chapter

Speeding Up the Metabolism in E-commerce by Reinforcement Mechanism Design

Authors: Hua-Lin He, Chun-Xiang Pan, Qing Da, An-Xiang Zeng

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

In a large E-commerce platform, all participants compete for impressions under the platform's allocation mechanism. Existing methods mainly focus on the short-term return based on current observations rather than the long-term return. In this paper, we formally establish a lifecycle model for products by defining the introduction, growth, maturity, and decline stages and their transitions throughout the whole life period. Based on this model, we further propose a reinforcement learning based mechanism design framework for impression allocation, which incorporates a first-principal-component-based permutation and a novel experience generation method, to maximize the short-term as well as the long-term return of the platform. With the power of trial and error, it is possible to recognize in advance potentially hot products in the introduction stage as well as potentially slow-selling products in the decline stage, so the metabolism can be sped up by an optimal impression allocation strategy. We evaluate our algorithm in a simulated environment built on one of the largest E-commerce platforms, and a significant improvement has been achieved in comparison with the baseline solutions. Code related to this paper is available at: https://github.com/WXFMAV/lifecycle_open.
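
The paper itself sits behind the excerpt above, so the following is only a minimal sketch of the four-stage lifecycle the abstract names (introduction, growth, maturity, decline). The transition rules and the `sales_growth` signal are illustrative assumptions, not the authors' model:

```python
from enum import Enum

class Stage(Enum):
    INTRODUCTION = "introduction"
    GROWTH = "growth"
    MATURITY = "maturity"
    DECLINE = "decline"

def next_stage(stage: Stage, sales_growth: float,
               rise: float = 0.10, fall: float = -0.10) -> Stage:
    """Advance a product's lifecycle stage from its sales-growth rate.

    The thresholds are hypothetical: strong growth moves an introduced
    product into growth, flattening growth matures it, and sustained
    negative growth sends a growing or mature product into decline.
    """
    if stage is Stage.INTRODUCTION and sales_growth > rise:
        return Stage.GROWTH
    if stage is Stage.GROWTH and fall <= sales_growth <= rise:
        return Stage.MATURITY
    if stage in (Stage.GROWTH, Stage.MATURITY) and sales_growth < fall:
        return Stage.DECLINE
    return stage

# Example: a product whose weekly sales growth cools off over time.
stage = Stage.INTRODUCTION
for growth in (0.30, 0.20, 0.05, -0.02, -0.25):
    stage = next_stage(stage, growth)
    print(f"growth={growth:+.2f} -> {stage.name}")
```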
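
Likewise, the "first principal component based permutation" mentioned in the abstract can be made concrete with a small NumPy sketch: order products by their projection onto the first principal component of their state vectors. The function name and feature layout here are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def first_pc_permutation(features: np.ndarray) -> np.ndarray:
    """Order products by their projection onto the first principal component.

    `features` is an (n_products, n_features) matrix of product state
    vectors (e.g. clicks, sales, lifecycle signals). Returns product
    indices sorted by first-PC score, highest first, which can serve as
    a canonical ordering before impressions are allocated.
    """
    # Center the data; principal directions are defined on centered features.
    centered = features - features.mean(axis=0)
    # The first principal component is the top right-singular vector.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Project every product onto that direction and sort descending.
    scores = centered @ vt[0]
    return np.argsort(-scores)

# Example: five products described by three features each.
rng = np.random.default_rng(0)
order = first_pc_permutation(rng.random((5, 3)))
print(order)
```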

Metadata
Title
Speeding Up the Metabolism in E-commerce by Reinforcement Mechanism Design
Authors
Hua-Lin He
Chun-Xiang Pan
Qing Da
An-Xiang Zeng
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-10997-4_7