Published in: International Journal of Data Science and Analytics 3/2022

01-03-2022 | Review

A survey on machine learning methods for churn prediction

Authors: Louis Geiler, Séverine Affeldt, Mohamed Nadif


Abstract

The diversity and specificities of today's businesses have fostered a wide range of prediction techniques. In particular, churn prediction is a major economic concern for many companies. The purpose of this study is to draw general guidelines from a benchmark of supervised machine learning techniques, combined with widely used data sampling approaches, on publicly available datasets in the context of churn prediction. Choosing a priori the most appropriate sampling method as well as the most suitable classification model is not trivial, as the choice strongly depends on the intrinsic characteristics of the data. In this paper, we study the behavior of eleven supervised and semi-supervised learning methods and seven sampling approaches on sixteen diverse, publicly available churn-like datasets. Our evaluations, reported in terms of the Area Under the Curve (AUC) metric, explore the influence of sampling approaches and data characteristics on the performance of the studied learning methods. In addition, we propose the Nemenyi test and Correspondence Analysis as means of comparing and visualizing the associations between classification algorithms, sampling methods and datasets. Most importantly, our experiments lead to a practical recommendation for a prediction pipeline based on an ensemble approach, which can be successfully applied to a wide range of churn-like datasets.
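As a rough illustration of the kind of benchmark described above, the sketch below evaluates one (sampling method, classifier) pair by AUC under stratified cross-validation. It is not the authors' exact pipeline: the dataset is synthetic, the sampler and classifier are single examples of the grids studied in the paper, and the Nemenyi post-hoc comparison is omitted.

```python
# Minimal sketch of one benchmark cell: SMOTE + random forest scored by AUC.
# Assumptions: scikit-learn and imbalanced-learn are available; the dataset is
# a synthetic stand-in for a churn-like (imbalanced, binary) dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # applies sampling to training folds only

X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

pipeline = Pipeline([
    ("sampling", SMOTE(random_state=0)),
    ("classifier", RandomForestClassifier(n_estimators=200, random_state=0)),
])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = cross_val_score(pipeline, X, y, scoring="roc_auc", cv=cv)
print(f"AUC: {auc_scores.mean():.3f} +/- {auc_scores.std():.3f}")
```

Repeating this loop over all classifiers, samplers and datasets yields the AUC table to which rank-based tests such as the Nemenyi test can then be applied.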


Footnotes
1
In a binary or churn prediction context, \(G=2\) and we consider the two classes \(+\) and \(-\), which correspond to the churn and non-churn classes, respectively.
 
2
Before fitting a model, categorical variables are converted to their numerical representation through a dummification process where each category becomes a binary variable.
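A minimal sketch of this dummification step, assuming a pandas DataFrame; the column names are purely illustrative and not taken from the paper's datasets:

```python
import pandas as pd

# Toy churn-like frame with one categorical and one numerical column.
df = pd.DataFrame({
    "contract": ["monthly", "yearly", "monthly", "two-year"],
    "tenure_months": [3, 24, 7, 30],
})

# Each category of 'contract' becomes its own binary indicator column.
df_numeric = pd.get_dummies(df, columns=["contract"], dtype=int)
print(df_numeric.columns.tolist())
# ['tenure_months', 'contract_monthly', 'contract_two-year', 'contract_yearly']
```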
 
3
In our experiments, we consider both the linear SVM and the SVM-rbf, a kernel SVM using the radial basis function kernel, following the results of Amnueypornsakul et al. [6].
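The two variants can be contrasted as in the sketch below, assuming a scikit-learn setup; the data and hyperparameters are illustrative rather than the configuration used in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, weights=[0.85, 0.15], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Same preprocessing, two kernels: linear SVM vs. SVM-rbf.
for name, kernel in [("linear SVM", "linear"), ("SVM-rbf", "rbf")]:
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, probability=True))
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```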
 
4
Since GEV-NN, iForest and DevNet are specifically designed for imbalanced binary classification or anomaly detection, these approaches are evaluated without any sampling.
 
Literature
1. Abdillah, M.F., Nasri, J., Aditsania, A.: Using deep learning to predict customer churn in a mobile telecommunication network. eProc. Eng. 3(2) (2016)
4. Akbani, R., Kwek, S., Japkowicz, N.: Applying support vector machines to imbalanced datasets. In: European Conference on Machine Learning, pp. 39–50. Springer (2004)
5. Alam, S., Sonbhadra, S.K., Agarwal, S., et al.: One-class support vector classifiers: a survey. Knowl. Based Syst. 196, 105754 (2020)
6. Amnueypornsakul, B., Bhat, S., Chinprutthiwong, P.: Predicting attrition along the way: the UIUC model. In: Proceedings of the EMNLP 2014 Workshop on Analysis of Large Scale Social Interaction in MOOCs, pp. 55–59. https://doi.org/10.3115/v1/w14-4110 (2015)
7. Anderson, E.W., Sullivan, M.W.: The antecedents and consequences of customer satisfaction for firms. Mark. Sci. 12(2), 125–143 (1993)
8. Batista, G.E., Bazzan, A.L., Monard, M.C., et al.: Balancing training data for automated annotation of keywords: a case study. In: WOB, pp. 10–18 (2003)
9. Batista, G.E., Prati, R.C., Monard, M.C.: A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explor. Newsl. 6(1), 20–29 (2004)
10. Batuwita, R., Palade, V.: Efficient resampling methods for training support vector machines with imbalanced datasets. In: The 2010 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2010)
11. Benczúr, A.A., Csalogány, K., Lukács, L., et al.: Semi-supervised learning: a comparative study for web spam and telephone user churn. In: Graph Labeling Workshop in conjunction with ECML/PKDD, Citeseer (2007)
12. Benoit, D.F., Van den Poel, D.: Improving customer retention in financial services using kinship network information. Expert Syst. Appl. 39(13), 11435–11442 (2012)
13. Bermejo, P., Gámez, J.A., Puerta, J.M.: Improving the performance of naive Bayes multinomial in e-mail foldering by introducing distribution-based balance of datasets. Expert Syst. Appl. 38(3), 2072–2080 (2011)
14. Bhattacharya, C.: When customers are members: customer retention in paid membership contexts. J. Acad. Mark. Sci. 26(1), 31–44 (1998)
15. Błaszczyński, J., Stefanowski, J.: Local data characteristics in learning classifiers from imbalanced data. In: Advances in Data Analysis with Computational Intelligence Methods, pp. 51–85. Springer (2018)
16. Bolton, R.N.: A dynamic model of the duration of the customer's relationship with a continuous service provider: the role of satisfaction. Mark. Sci. 17(1), 45–65 (1998)
17. Bolton, R.N., Bronkhorst, T.M.: The relationship between customer complaints to the firm and subsequent exit behavior. ACR North Am. Adv. 22, 94–100 (1995)
19. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)
21. Breiman, L., Spector, P.: Submodel selection and evaluation in regression: the x-random case. Int. Stat. Rev. 60(3), 291–319 (1992)
22. Breiman, L., Friedman, J.H., Olshen, R.A., et al.: Classification and Regression Trees. Wadsworth, Belmont (1984)
24. Burez, J., Van den Poel, D.: Handling class imbalance in customer churn prediction. Expert Syst. Appl. 36(3), 4626–4636 (2009)
25. Burman, P.: A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika 76(3), 503–514 (1989)
26. Burrus, C.S., Barreto, J., Selesnick, I.W.: Iterative reweighted least-squares design of FIR filters. IEEE Trans. Signal Process. 42(11), 2926–2936 (1994)
27. Cabral, G.G., Oliveira, A.: One-class classification for heart disease diagnosis. In: IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2551–2556 (2014)
28. Castanedo, F., Valverde, G., Zaratiegui, J., et al.: Using deep learning to predict customer churn in a mobile telecommunication network (2014)
29. Cervantes, J., Garcia-Lamont, F., Rodríguez-Mazahua, L., et al.: A comprehensive survey on support vector machine classification: applications, challenges and trends. Neurocomputing 408, 189–215 (2020)
31. Chawla, N.V., Bowyer, K.W., Hall, L.O., et al.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
32. Chen, C., Liaw, A., Breiman, L., et al.: Using random forest to learn imbalanced data. Univ. Calif. Berkeley 110(1–12), 24 (2004)
33. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM (2016)
34. Chen, Y., Xie, X., Lin, S.D., et al.: WSDM Cup 2018: music recommendation and churn prediction. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 8–9. ACM (2018)
35. Chowdhury, A., Alspector, J.: Data duplication: an imbalance problem? In: ICML'2003 Workshop on Learning from Imbalanced Data Sets (II), Washington, DC (2003)
36. Clemente, M., Giner-Bosch, V., San Matías, S.: Assessing classification methods for churn prediction by composite indicators. Manuscript, Dept. of Applied Statistics, OR & Quality, Universitat Politècnica de València, Camino de Vera s/n 46022 (2010)
38. Coussement, K., De Bock, K.W.: Customer churn prediction in the online gambling industry: the beneficial effect of ensemble learning. J. Bus. Res. 66(9), 1629–1636 (2013)
39. Coussement, K., Van den Poel, D.: Churn prediction in subscription services: an application of support vector machines while comparing two parameter-selection techniques. Expert Syst. Appl. 34(1), 313–327 (2008)
40. Coussement, K., Benoit, D.F., Van den Poel, D.: Improved marketing decision making in a customer churn prediction context using generalized additive models. Expert Syst. Appl. 37(3), 2132–2143 (2010)
43. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
46. Dingli, A., Marmara, V., Fournier, N.S.: Comparison of deep learning algorithms to predict customer churn within a local retail industry. Int. J. Mach. Learn. Comput. 7(5), 128–132 (2017)
47. Domingos, P.: MetaCost: a general method for making classifiers cost-sensitive. In: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 155–164 (1999)
48. Drummond, C., Holte, R.C., et al.: C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling. In: Workshop on Learning from Imbalanced Datasets II, Citeseer, pp. 1–8 (2003)
50. Effendy, V., Baizal, Z.A., et al.: Handling imbalanced data in customer churn prediction using combined sampling and weighted random forest. In: 2014 2nd International Conference on Information and Communication Technology (ICoICT), pp. 325–330. IEEE (2014)
52. Friedman, J., Hastie, T., Tibshirani, R.: The Elements of Statistical Learning, vol. 1. Springer Series in Statistics, New York (2001)
53. Gandomi, A., Haider, M.: Beyond the hype: big data concepts, methods, and analytics. Int. J. Inf. Manag. 35(2), 137–144 (2015)
54. Ganesan, S.: Determinants of long-term orientation in buyer–seller relationships. J. Mark. 58(2), 1–19 (1994)
55. García, D.L., Nebot, À., Vellido, A.: Intelligent data analysis approaches to churn as a business problem: a survey. Knowl. Inf. Syst. 51(3), 719–774 (2017)
56. García, V., Mollineda, R.A., Sánchez, J.S.: On the k-NN performance in a challenging scenario of imbalance and overlapping. Pattern Anal. Appl. 11(3), 269–280 (2008)
57. García, V., Sánchez, J.S., Mollineda, R.A.: On the effectiveness of preprocessing methods when dealing with different levels of class imbalance. Knowl. Based Syst. 25(1), 13–21 (2012)
59. Günther, C.C., Tvete, I.F., Aas, K., et al.: Modelling and predicting customer churn from an insurance company. Scand. Actuar. J. 1, 58–71 (2014)
60. Gupta, S., Lehmann, D.R., Stuart, J.A.: Valuing customers. J. Mark. Res. 41(1), 7–18 (2004)
61. Guyon, I., Gunn, S., Nikravesh, M., et al.: Feature Extraction: Foundations and Applications, vol. 207. Springer, Berlin (2008)
62. Guyon, I., Lemaire, V., Boullé, M., et al.: Analysis of the KDD Cup 2009: fast scoring on a large Orange customer database. In: Proceedings of the 2009 International Conference on KDD-Cup 2009, vol. 7, pp. 1–22. JMLR.org (2009)
63. Hadden, J., Tiwari, A., Roy, R., et al.: Churn prediction: does technology matter? Int. J. Intell. Technol. 1(2), 104–110 (2006)
65. Han, H., Wang, W.Y., Mao, B.H.: Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In: International Conference on Intelligent Computing, pp. 878–887. Springer (2005)
67. Hart, P.: The condensed nearest neighbor rule (corresp.). IEEE Trans. Inf. Theory 14(3), 515–516 (1968)
68. He, H., Ma, Y.: Imbalanced Learning: Foundations, Algorithms, and Applications. Wiley, New York (2013)
69. He, H., Bai, Y., Garcia, E., Li, S.: ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: IEEE International Joint Conference on Neural Networks, 2008. IJCNN 2008 (IEEE World Congress on Computational Intelligence), vol. 3, pp. 1322–1328 (2008)
70. Hitt, L.M., Frei, F.X.: Do better customers utilize electronic distribution channels? The case of PC banking. Manag. Sci. 48(6), 732–748 (2002)
71. Holte, R.C., Acker, L., Porter, B.W., et al.: Concept learning and the problem of small disjuncts. In: IJCAI, Citeseer, pp. 813–818 (1989)
72. Hosein, P., Sewdhan, G., Jailal, A.: Soft-churn: optimal switching between prepaid data subscriptions on e-SIM support smartphones. In: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA), pp. 1–6. IEEE (2021)
74. Hudaib, A., Dannoun, R., Harfoushi, O., et al.: Hybrid data mining models for predicting customer churn. Int. J. Commun. Netw. Syst. Sci. 8(05), 91 (2015)
75. John, G.H., Langley, P.: Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, pp. 338–345. Morgan Kaufmann Publishers Inc. (1995)
76. Kamaruddin, S., Ravi, V.: Credit card fraud detection using big data analytics: use of PSOAANN based one-class classification. In: Proceedings of the International Conference on Informatics and Analytics (ICIA-16). Association for Computing Machinery, New York. https://doi.org/10.1145/2980258.2980319 (2016)
77. Kawale, J., Pal, A., Srivastava, J.: Churn prediction in MMORPGs: a social influence based approach. In: 2009 International Conference on Computational Science and Engineering, pp. 423–428. IEEE (2009)
78. Kim, Y.: Toward a successful CRM: variable selection, sampling, and ensemble. Decis. Support Syst. 41(2), 542–553 (2006)
79. King, G., Zeng, L.: Logistic regression in rare events data. Polit. Anal. 9(2), 137–163 (2001)
80. Kohavi, R., et al.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: IJCAI, Montreal, Canada, pp. 1137–1145 (1995)
81. Kong, J., Kowalczyk, W., Menzel, S., et al.: Improving imbalanced classification by anomaly detection. In: Bäck, T., Preuss, M., Deutz, A., et al. (eds.) Parallel Problem Solving from Nature, vol. XVI, pp. 512–523. Springer, Cham (2020)
82. Kumar, D.A., Ravi, V., et al.: Predicting credit card customer churn in banks using data mining. Int. J. Data Anal. Tech. Strateg. 1(1), 4–28 (2008)
83. Laurikkala, J.: Improving identification of difficult small classes by balancing class distribution. In: Conference on Artificial Intelligence in Medicine in Europe, pp. 63–66. Springer (2001)
84. Lemmens, A., Croux, C.: Bagging and boosting classification trees to predict churn. J. Mark. Res. 43(2), 276–286 (2006)
85. Leung, C.K., Pazdor, A.G., Souza, J.: Explainable artificial intelligence for data science on customer churn. In: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA), pp. 1–10. IEEE (2021)
87. Ling, C.X., Li, C.: Data mining for direct marketing: problems and solutions. In: KDD, pp. 73–79 (1998)
89. López, V., Fernández, A., Moreno-Torres, J.G., et al.: Analysis of preprocessing vs. cost-sensitive learning for imbalanced classification. Open problems on intrinsic data characteristics. Expert Syst. Appl. 39(7), 6585–6608 (2012)
90. López, V., Fernández, A., García, S., et al.: An insight into classification with imbalanced data: empirical results and current trends on using data intrinsic characteristics. Inf. Sci. 250, 113–141 (2013)
91. Maxham, J.G.: Service recovery's influence on consumer satisfaction, positive word-of-mouth, and purchase intentions. J. Bus. Res. 54(1), 11–24 (2001)
92. McKinley Stacker, I.: IBM Watson Analytics. Sample data: HR employee attrition and performance [data file] (2015)
93. Mittal, B., Lassar, W.M.: Why do customers switch? The dynamics of satisfaction versus loyalty. J. Serv. Mark. 12(3), 177–194 (1998)
94. Mittal, V., Kamakura, W.A.: Satisfaction, repurchase intent, and repurchase behavior: investigating the moderating effect of customer characteristics. J. Mark. Res. 38(1), 131–142 (2001)
95. Mozer, M.C., Wolniewicz, R., Grimes, D.B., et al.: Predicting subscriber dissatisfaction and improving retention in the wireless telecommunications industry. IEEE Trans. Neural Netw. 11(3), 690–696 (2000)
96. Munkhdalai, L., Munkhdalai, T., Park, K.H., et al.: An end-to-end adaptive input selection with dynamic weights for forecasting multivariate time series. IEEE Access 7, 99099–99114 (2019)
97. Munkhdalai, L., Munkhdalai, T., Ryu, K.H.: GEV-NN: a deep neural network architecture for class imbalance problem in binary classification. Knowl. Based Syst. 194, 105534 (2020)
98. Napierała, K., Stefanowski, J., Wilk, S.: Learning from imbalanced data in presence of noisy and borderline examples. In: International Conference on Rough Sets and Current Trends in Computing, pp. 158–167. Springer (2010)
99. Neslin, S.A., Gupta, S., Kamakura, W., et al.: Defection detection: measuring and understanding the predictive accuracy of customer churn models. J. Mark. Res. 43(2), 204–211 (2006)
100. Nguyen, H.M., Cooper, E.W., Kamei, K.: Borderline over-sampling for imbalanced data classification. Int. J. Knowl. Eng. Soft Data Paradig. 3(1), 4–21 (2011)
101. Nguyen, N., LeBlanc, G.: The mediating role of corporate image on customers' retention decisions: an investigation in financial services. Int. J. Bank Market. 16(2), 52–65 (1998)
102. Owen, A.B.: Infinitely imbalanced logistic regression. J. Mach. Learn. Res. 8(Apr), 761–773 (2007)
103. Pang, G., Xu, H., Cao, L., et al.: Selective value coupling learning for detecting outliers in high-dimensional categorical data. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 807–816 (2017)
104. Pang, G., Shen, C., van den Hengel, A.: Deep anomaly detection with deviation networks. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 353–362 (2019)
106. Paulin, M., Perrien, J., Ferguson, R.J., et al.: Relational norms and client retention: external effectiveness of commercial banking in Canada and Mexico. Int. J. Bank Market. 16(1), 24–31 (1998)
107. Reichheld, F.F., Sasser, W.E.: Zero defections: quality comes to services. Harv. Bus. Rev. 68(5), 105–111 (1990)
108. Reinartz, W.J., Kumar, V.: The impact of customer relationship characteristics on profitable lifetime duration. J. Mark. 67(1), 77–99 (2003)
109. Rennie, J.D.: Improving multi-class text classification with naive Bayes. Technical Report AITR, vol. 4 (2001)
110. Risselada, H., Verhoef, P.C., Bijmolt, T.H.: Staying power of churn prediction models. J. Interact. Mark. 24(3), 198–208 (2010)
111. Ruff, L., Kauffmann, J.R., Vandermeulen, R.A., et al.: A unifying review of deep and shallow anomaly detection. In: Proceedings of the IEEE (2021)
112. Ruisen, L., Songyi, D., Chen, W., et al.: Bagging of XGBoost classifiers with random under-sampling and Tomek link for noisy label-imbalanced data. In: IOP Conference Series: Materials Science and Engineering, p. 012004. IOP Publishing (2018)
113. Salas-Eljatib, C., Fuentes-Ramirez, A., Gregoire, T.G., et al.: A study on the effects of unbalanced data when fitting logistic regression models in ecology. Ecol. Ind. 85, 502–508 (2018)
114. Saradhi, V.V., Palshikar, G.K.: Employee churn prediction. Expert Syst. Appl. 38(3), 1999–2006 (2011)
115. Schölkopf, B., Williamson, R., Smola, A., et al.: Support Vector Method for Novelty Detection, pp. 582–588. MIT Press, Cambridge (1999)
116. Seiffert, C., Khoshgoftaar, T.M., Van Hulse, J., et al.: An empirical study of the classification performance of learners on imbalanced and noisy software quality data. Inf. Sci. 259, 571–595 (2014)
117. Seymen, O.F., Dogan, O., Hiziroglu, A.: Customer churn prediction using deep learning. In: International Conference on Soft Computing and Pattern Recognition, pp. 520–529. Springer (2020)
118. Siber, R.: Combating the churn phenomenon - as the problem of customer defection increases, carriers are having to find new strategies for keeping subscribers happy. Telecommun. Int. Edn. 31(10), 77–81 (1997)
119. Śniegula, A., Poniszewska-Marańda, A., Popović, M.: Study of machine learning methods for customer churn prediction in telecommunication company. In: Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services, pp. 640–644 (2019)
120. Stefanowski, J.: Dealing with data difficulty factors while learning from imbalanced data. In: Challenges in Computational Statistics and Data Mining, pp. 333–363. Springer (2016)
123. Tan, S.: Neighbor-weighted k-nearest neighbor for unbalanced text corpus. Expert Syst. Appl. 28(4), 667–671 (2005)
124. Tang, L., Thomas, L., Fletcher, M., et al.: Assessing the impact of derived behavior information on customer attrition in the financial service industry. Eur. J. Oper. Res. 236(2), 624–633 (2014)
126. Tian, J., Gu, H., Liu, W.: Imbalanced classification using support vector machine ensemble. Neural Comput. Appl. 20(2), 203–209 (2011)
128. Umayaparvathi, V., Iyakutti, K.: A survey on customer churn prediction in telecom industry: datasets, methods and metrics. Int. Res. J. Eng. Technol. 3, 2395 (2016)
129. Umayaparvathi, V., Iyakutti, K.: Automated feature selection and churn prediction using deep learning models. Int. Res. J. Eng. Technol. 4(3), 1846–1854 (2017)
130. Vafeiadis, T., Diamantaras, K.I., Sarigiannidis, G., et al.: A comparison of machine learning techniques for customer churn prediction. Simul. Model. Pract. Theory 55, 1–9 (2015)
131. Van Hulse, J., Khoshgoftaar, T.M., Napolitano, A., et al.: Feature selection with high-dimensional imbalanced data. In: 2009 IEEE International Conference on Data Mining Workshops, pp. 507–514. IEEE (2009)
132. Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
133. Varki, S., Colgate, M.: The role of price perceptions in an integrated model of behavioral intentions. J. Serv. Res. 3(3), 232–240 (2001)
135. Wang, S., Li, D., Song, X., et al.: A feature selection method based on improved Fisher's discriminant ratio for text sentiment classification. Expert Syst. Appl. 38(7), 8696–8702 (2011)
136. Van den Poel, D., Lariviere, B.: Customer attrition analysis for financial services using proportional hazard models. Eur. J. Oper. Res. 157(1), 196–217 (2004)
137. Wang, S., Liu, W., Wu, J., et al.: Training deep neural networks on imbalanced data sets. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 4368–4374. IEEE (2016)
139. Weiss, G.M.: Mining with rarity: a unifying framework. ACM SIGKDD Explor. Newsl. 6(1), 7–19 (2004)
140. Weiss, G.M.: The impact of small disjuncts on classifier learning. In: Data Mining, pp. 193–226. Springer (2010)
141. Weiss, G.M., Hirsh, H.: A quantitative study of small disjuncts. AAAI/IAAI 2000, 665–670 (2000)
142. Weiss, G.M., Provost, F.: Learning when training data are costly: the effect of class distribution on tree induction. J. Artif. Intell. Res. 19, 315–354 (2003)
143. Wilson, D.L.: Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. Syst. Man Cybern. SMC-2(3), 408–421 (1972)
144. Xiao, J., Huang, L., Xie, L.: Cost-sensitive semi-supervised ensemble model for customer churn prediction. In: 2018 15th International Conference on Service Systems and Service Management (ICSSSM), pp. 1–6. IEEE (2018)
146. Xie, Y., Li, X.: Churn prediction with linear discriminant boosting algorithm. In: International Conference on Machine Learning and Cybernetics, pp. 228–233. IEEE (2008)
147. Yang, C., Shi, X., Jie, L., et al.: I know you'll be back: interpretable new user clustering and churn prediction on a mobile social application. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 914–922 (2018)
148. Yang, Z., Peterson, R.T.: Customer perceived value, satisfaction, and loyalty: the role of switching costs. Psychol. Market. 21(10), 799–822 (2004)
150. Zadrozny, B., Elkan, C.: Learning and making decisions when costs and probabilities are both unknown. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 204–213 (2001)
151. Zadrozny, B., Langford, J., Abe, N.: Cost-sensitive learning by cost-proportionate example weighting. In: Third IEEE International Conference on Data Mining, pp. 435–442. IEEE (2003)
152. Zeithaml, V.A., Berry, L.L., Parasuraman, A.: The behavioral consequences of service quality. J. Mark. 60(2), 31–46 (1996)
153. Zhao, Z., Peng, H., Lan, C., et al.: Imbalance learning for the prediction of N6-methylation sites in mRNAs. BMC Genom. 19(1), 574 (2018)
154. Zhou, F., Yang, S., Fujita, H., et al.: Deep learning fault diagnosis method based on global optimization GAN for unbalanced data. Knowl. Based Syst. 187, 104837 (2020)
155. Zhou, Z.H., Liu, X.Y.: Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Trans. Knowl. Data Eng. 18(1), 63–77 (2005)
157. Zong, B., Song, Q., Min, M.R., et al.: Deep autoencoding Gaussian mixture model for unsupervised anomaly detection. In: International Conference on Learning Representations (2018)
Metadata
Title: A survey on machine learning methods for churn prediction
Authors: Louis Geiler, Séverine Affeldt, Mohamed Nadif
Publication date: 01-03-2022
Publisher: Springer International Publishing
Published in: International Journal of Data Science and Analytics, Issue 3/2022
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI: https://doi.org/10.1007/s41060-022-00312-5
