Interpretable machine learning for demand modeling with high-dimensional data using Gradient Boosting Machines and Shapley values

  • Research Article
Journal of Revenue and Pricing Management

Abstract

Forecasting demand and understanding sales drivers are among the most important tasks in retail analytics. Traditionally, however, sales modeling has relied predominantly on linear models and/or models with a small number of predictors. Real-world demand is determined by complex substitution and complementarity patterns among a large number of interrelated SKUs, by nonlinear effects of prices, promotions, and seasonality, and by many other factors, their lagged values, and their interactions; a realistic model has to account for all of these. We propose a conceptual model for sales modeling based on standard POS data available to any retailer and use it to generate almost 500 potentially useful predictors of a focal SKU’s sales. In our comparison of three classes of models, Gradient Boosting Machines outperformed Random Forests and Elastic Nets. Using interpretable machine learning methods, we derive actionable insights into the importance of the various groups of predictors from the conceptual model and demonstrate how helpful it can be for marketing managers to decompose predictions into the effects of individual regressors using an approximation of Shapley values for feature attribution.
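
To make the idea of decomposing a prediction into per-regressor effects concrete, the following is a minimal sketch of Shapley-value attribution on a toy demand function. It is not the authors' model or code: the function `toy_demand`, its coefficients, the three feature names, and the baseline point are all hypothetical. With only three features the Shapley values can be computed exactly by enumerating coalitions, whereas the article relies on an approximation, which becomes necessary with hundreds of predictors.

```python
from itertools import combinations
from math import factorial


def toy_demand(x):
    """Hypothetical nonlinear demand model (illustrative only, not from the article)."""
    price, promo, competitor_price = x
    base = 100.0 - 8.0 * price + 3.0 * competitor_price
    # The promotion lift is larger when the own price is low (an interaction).
    lift = promo * (20.0 + 2.0 * (10.0 - price))
    return base + lift


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    A coalition's value is the model output when the features in the
    coalition take their observed values and all others stay at baseline.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


feature_names = ["price", "promo", "competitor_price"]  # hypothetical predictors
baseline = [6.0, 0.0, 6.0]  # assumed reference point: regular price, no promotion
x = [4.0, 1.0, 7.0]         # assumed observation: discounted price, promotion on

phi = shapley_values(toy_demand, x, baseline)
for name, contribution in zip(feature_names, phi):
    print(f"{name}: {contribution:+.2f}")
print(f"total: {sum(phi):+.2f}")
print(f"prediction - baseline: {toy_demand(x) - toy_demand(baseline):+.2f}")
```

The contributions sum to the difference between the prediction and the baseline prediction, which is the property that lets a manager read a forecast as a baseline plus per-driver effects. The price-promotion interaction is split equally between the two features involved, which is how Shapley values treat interactions in general.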

Notes

  1. https://www.dunnhumby.com/careers/engineering/sourcefiles.

Acknowledgements

The research was supported by the Russian Science Foundation (Project № 18-71-00119).

Author information

Corresponding author

Correspondence to Evgeny A. Antipov.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Antipov, E.A., Pokryshevskaya, E.B. Interpretable machine learning for demand modeling with high-dimensional data using Gradient Boosting Machines and Shapley values. J Revenue Pricing Manag 19, 355–364 (2020). https://doi.org/10.1057/s41272-020-00236-4

  • DOI: https://doi.org/10.1057/s41272-020-00236-4
