2021 | OriginalPaper | Chapter

Reinforcement Learning Approach for Dynamic Pricing

Authors: Maksim Balashov, Anton Kiselev, Alena Kuryleva

Published in: The Economics of Digital Transformation

Publisher: Springer International Publishing

Abstract

With the introduction of digital technologies, customers can more easily compare prices and choose the product that suits them best. This makes demand unstable, so market players need to revise their pricing policies in favor of one that takes into account the producer's resources and the current state of demand.
Dynamic pricing appears to be an adequate solution to this problem because it adapts to customer expectations. In addition, the digitalization of the economy creates unique opportunities for applying it.
The purpose of this study is to evaluate whether the concept of dynamic pricing can be applied to traditional retail.
Within this study, the dynamic pricing problem is solved with the goal of maximizing profit from the sale of a specific associated product at an automatic gas station.
To solve this problem, the authors propose machine learning approaches that adapt to the external environment, one of which is reinforcement learning (RL). They also propose an approach for restoring the demand surface, which is then used to train the agent.
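
The abstract does not say which RL algorithm or demand model the chapter uses, so the following is only a minimal sketch of the general idea: an agent learns which price to set in each hour of the day by interacting with a simulated demand surface. Everything concrete here is an assumption made for illustration: the tabular Q-learning update, the linear `demand_surface` function, the price grid `PRICES`, the cost `UNIT_COST`, and the peak hours.

```python
import random

# All numbers below are hypothetical and chosen only for illustration.
PRICES = [2.0, 3.0, 4.0]   # discrete price levels the agent may choose
UNIT_COST = 1.0            # purchase cost per unit sold
HOURS = 24                 # state = hour of the day


def demand_surface(price, hour):
    """Assumed expected demand (units per hour) as a function of price and hour.

    Stands in for a demand surface restored from sales data.
    """
    peak = 7 <= hour <= 9 or 17 <= hour <= 19
    base = 30.0 if peak else 20.0
    return max(0.0, base - 4.0 * price)


def train_agent(episodes=3000, alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning; one episode is one simulated day."""
    rng = random.Random(seed)
    q = [[0.0] * len(PRICES) for _ in range(HOURS)]
    for ep in range(episodes):
        # Exploration rate decays linearly from 1.0 to a floor of 0.05.
        epsilon = max(0.05, 1.0 - ep / episodes)
        for hour in range(HOURS):
            if rng.random() < epsilon:
                action = rng.randrange(len(PRICES))  # explore
            else:
                action = max(range(len(PRICES)), key=lambda a: q[hour][a])
            # Noisy sales drawn around the assumed demand surface.
            sold = max(0.0, demand_surface(PRICES[action], hour) + rng.gauss(0.0, 0.5))
            reward = (PRICES[action] - UNIT_COST) * sold  # profit for this hour
            next_hour = (hour + 1) % HOURS
            target = reward + gamma * max(q[next_hour])
            q[hour][action] += alpha * (target - q[hour][action])
    return q


def greedy_prices(q):
    """Price with the highest learned value for each hour."""
    return [PRICES[max(range(len(PRICES)), key=lambda a: row[a])] for row in q]


if __name__ == "__main__":
    for hour, price in enumerate(greedy_prices(train_agent())):
        print(hour, price)
```

Under these assumed numbers, expected hourly profit off-peak is 12, 16, and 12 for the three prices, and 22, 36, and 42 at peak, so a trained agent should tend to charge 3.0 off-peak and 4.0 at peak. Because the chosen price does not affect the next state in this toy setup, the problem is effectively a contextual bandit; a realistic setting with inventory or customer memory would make the sequential part of RL matter more.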

Metadata
Title
Reinforcement Learning Approach for Dynamic Pricing
Authors
Maksim Balashov
Anton Kiselev
Alena Kuryleva
Copyright Year
2021
Publisher
Springer International Publishing
DOI
https://doi.org/10.1007/978-3-030-59959-1_8