Intelligent churn prediction in telecom: employing mRMR feature selection and RotBoost based ensemble classification

Idris, Adnan; Khan, Asifullah; Lee, Yeon Soo

doi:10.1007/s10489-013-0440-x

Published: 21 April 2013

Applied Intelligence, volume 39, pages 659–672 (2013)

Adnan Idris¹,²,
Asifullah Khan¹ &
Yeon Soo Lee³

Abstract

Churn prediction in telecom has recently attracted substantial interest from stakeholders because of the associated revenue losses. Predicting telecom churners is a challenging problem because of the sheer size of telecom datasets. In this regard, we propose an intelligent churn prediction system for telecom that employs an efficient feature extraction technique and an ensemble method. We have used Random Forest, Rotation Forest, RotBoost and DECORATE ensembles in combination with minimum redundancy and maximum relevance (mRMR), Fisher’s ratio and F-score methods to model the telecom churn prediction problem. We have observed that the mRMR method returns the most explanatory features compared to Fisher’s ratio and F-score, which significantly reduces the computations and helps the ensembles attain improved performance. In comparison to Random Forest, Rotation Forest and DECORATE, RotBoost in combination with mRMR features attains better prediction performance on the standard telecom datasets. The better performance of the RotBoost ensemble is largely attributed to the rotation of the feature space, which enables the base classifier to learn different aspects of the churners and non-churners. Moreover, the AdaBoost process in RotBoost also contributes to higher prediction accuracy by handling hard instances. The performance evaluation is conducted on standard telecom datasets using AUC, sensitivity and specificity based measures. Simulation results reveal that the proposed approach based on RotBoost in combination with mRMR features (CP-MRB) is effective in handling the high dimensionality of the telecom datasets. CP-MRB offers higher accuracy in predicting churners and is thus a promising approach for modeling the challenging problem of customer churn prediction in telecom.
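
As an illustration of the mRMR idea named in the abstract, the following is a minimal sketch of greedy feature selection that trades relevance to the churn label against redundancy with features already chosen. It assumes discrete features and uses the mutual-information difference form of the criterion, which may differ from the exact variant used in the article. The toy data, the feature names, and the function names `mutual_information` and `mrmr_select` are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log(c * n / (count_x[x] * count_y[y]))
    return mi


def mrmr_select(columns, labels, k):
    """Greedily pick up to k feature names by relevance minus mean redundancy.

    columns: dict mapping a feature name to its list of discrete values.
    labels:  list of class labels aligned with the feature values.
    """
    relevance = {name: mutual_information(values, labels)
                 for name, values in columns.items()}
    selected = []
    remaining = list(columns)

    def score(name):
        if not selected:
            return relevance[name]
        # Mean mutual information with the features picked so far
        redundancy = sum(mutual_information(columns[name], columns[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 1 = churner, 0 = non-churner
churn = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "dropped_calls":      [1, 1, 1, 1, 0, 0, 0, 1],  # strongly relevant
    "dropped_calls_copy": [1, 1, 1, 1, 0, 0, 0, 1],  # exact duplicate
    "intl_plan":          [1, 1, 0, 1, 0, 0, 1, 0],  # weaker, little overlap
}
selected = mrmr_select(features, churn, k=2)
print(selected)
```

In this toy setting the duplicate column is as relevant as the original, but its redundancy with the already selected column outweighs that relevance, so the less relevant yet less overlapping `intl_plan` is chosen second.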



Acknowledgement

This work is supported by the Higher Education Commission of Pakistan (HEC) under award No. 17-5-6(Ps6-002)/HEC/Sch/2010 and by the Korean National Research Foundation under grant No. NRF-2011-0006806.

Author information

Authors and Affiliations

Pattern Recognition Lab, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Nilore, Pakistan
Adnan Idris & Asifullah Khan
Department of Computer Sciences and Information Technology, University of Poonch Rawalakot, Rawalakot, Azad Jammu & Kashmir, Pakistan
Adnan Idris
Department of Biomedical Engineering, College of Medical Science, Catholic University of Daegu, Daegu, South Korea
Yeon Soo Lee

Corresponding author

Correspondence to Yeon Soo Lee.

About this article

Cite this article

Idris, A., Khan, A. & Lee, Y.S. Intelligent churn prediction in telecom: employing mRMR feature selection and RotBoost based ensemble classification. Appl Intell 39, 659–672 (2013). https://doi.org/10.1007/s10489-013-0440-x

Issue Date: October 2013
