Building comprehensible customer churn prediction models with advanced rule induction techniques
Research highlights
► The literature has paid very limited attention to the comprehensibility and justifiability of customer churn prediction models.
► ALBA improves learning, resulting in comprehensible customer churn prediction models with increased performance.
► AntMiner+ results in accurate, comprehensible, and most importantly, justifiable customer churn prediction models.
Introduction
In recent decades we have witnessed an explosion of data. This data contains valuable information, but that information lies hidden in the vast collection of raw records. Data mining entails the overall process of extracting knowledge from this data, and its techniques have been successfully applied in many different domains. Well-known examples are breast-cancer detection in the biomedical sector, market basket analysis in the retail sector (Berry & Linoff, 2004), and credit scoring in the financial sector (Baesens et al., 2003). This paper, however, focuses on the use of data mining to predict customer churn.
Customer churn prediction models aim to detect customers with a high propensity to attrite. An accurate segmentation of the customer base allows a company to target the customers that are most likely to churn in a retention marketing campaign, which makes more efficient use of the limited resources available for such a campaign. Customer retention is profitable to a company because: (1) Attracting new clients costs five to six times more than retaining existing ones (Athanassopoulos, 2000; Bhattacharya, 1998; Colgate & Danaher, 2000; Rasmusson, 1999). (2) Long-term customers generate higher profits, tend to be less sensitive to competitive marketing activities, become less costly to serve, and may provide new referrals through positive word-of-mouth, while dissatisfied customers might spread negative word-of-mouth (Colgate et al., 1996; Ganesh et al., 2000; Mizerski, 1982; Paulin et al., 1998; Reichheld, 1996; Stum & Thiry, 1991; Zeithaml et al., 1996). (3) Losing customers leads to opportunity costs because of reduced sales (Rust & Zahorik, 1993). A small improvement in customer retention can hence lead to a significant increase in profit (Van den Poel & Larivière, 2004). Churn prediction models therefore need to be both accurate and comprehensible: accurate to identify the customers who are about to churn, and comprehensible to reveal their reasons for doing so. As will be discussed in Section 2, many data mining techniques have already been tested for their predictive power on churn. Much less attention, however, has been paid to the comprehensibility and the justifiability of the developed models. Note that churn prediction is just one of the applications of data mining in marketing; others include customer lifetime value prediction (Glady, Baesens, & Croux, 2009), frequent itemset mining (Agrawal & Srikant, 1994), and sales forecasting (Thomassey & Happiette, 2007).
In this paper we introduce the application of two novel data mining techniques for customer churn prediction. The first technique, AntMiner+, uses Ant Colony Optimization (ACO) to infer rules from data, and explicitly seeks to induce accurate, comprehensible, and intuitive classification rule-sets (Martens et al., 2007). So far AntMiner+ has been successfully applied to credit scoring (Martens et al., 2006), software mining (Vandecruys et al., 2008), audit mining (Martens, Bruynseels, Baesens, Willekens, & Vanthienen, 2008), and business/ICT alignment prediction (Cumps et al., 2009). An advantage of AntMiner+ is the possibility to incorporate domain knowledge (Martens et al., 2006), ensuring intuitive decision support models.
The second technique is an Active Learning Based Approach (ALBA) for support vector machine (SVM) rule extraction (Martens, Van Gestel, & Baesens, 2009). ALBA manipulates a dataset in two ways: it replaces the class labels of the data instances with the labels predicted by the SVM, and it generates additional data instances close to the class boundaries. Applying simple rule induction techniques such as C4.5 or RIPPER to the manipulated dataset results in improved learning, and thus in a more accurate, but still comprehensible, rule-set.
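The two data manipulations described for ALBA can be illustrated with a minimal sketch. It is not the authors' implementation: a hand-written linear decision function with hypothetical weights stands in for a trained SVM, the training points with the smallest absolute decision value stand in for the instances near the class boundary, and the names `decision_value`, `predict`, `alba_like_dataset`, `n_extra`, `n_near`, and `spread` are all assumptions made for illustration.

```python
import random


def decision_value(x, w=(1.5, -1.0), b=-0.2):
    """Stand-in for a trained SVM decision function f(x); weights are hypothetical."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(x):
    """Class label assigned by the black-box model: 1 (churn) or 0 (no churn)."""
    return 1 if decision_value(x) >= 0 else 0


def alba_like_dataset(X, y, n_extra=100, n_near=10, spread=0.1, seed=0):
    """Return (manipulated dataset, number of original labels that changed)."""
    rng = random.Random(seed)

    # Manipulation 1: replace every original label by the model's prediction.
    relabeled = [(x, predict(x)) for x in X]
    n_changed = sum(1 for (_, new), old in zip(relabeled, y) if new != old)

    # Manipulation 2: generate extra instances close to the class boundary.
    # Assumption: the points with the smallest |f(x)| approximate that region.
    near_boundary = sorted(X, key=lambda x: abs(decision_value(x)))[:n_near]
    extra = []
    for _ in range(n_extra):
        base = rng.choice(near_boundary)
        x_new = tuple(v + rng.gauss(0.0, spread) for v in base)
        extra.append((x_new, predict(x_new)))  # labeled by the model, not by hand

    return relabeled + extra, n_changed
```

The returned dataset would then be passed to a rule learner such as C4.5 or RIPPER, which is outside the scope of this sketch.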
The remainder of this paper is structured as follows. First, in Section 2, the domain of customer churn prediction modeling is introduced by means of a broad literature study. Then in Section 3, the workings of AntMiner+ and ALBA are briefly explained. In Section 4 both techniques are applied to predict customer churn, and the setup and results of a series of experiments are discussed. The final section concludes the paper.
Section snippets
Customer churn prediction modeling
Customer relationship management, and customer churn prediction in particular, have received growing attention during the last decade. Table 1 provides an overview of the literature on the use of data mining techniques for customer churn prediction modeling. The table summarizes the applied modeling techniques, the characteristics of the assessed datasets, and the validation and evaluation of the results. Also included are preprocessing steps such as sampling and variable selection.
In this paper
Advanced rule induction techniques: AntMiner+ and ALBA
As churn prediction models should be both accurate and comprehensible, we will focus on the use of rule-based classification techniques. More specifically, we will induce rule-sets from a churn dataset using AntMiner+ and ALBA, as well as with more traditional rule induction techniques C4.5 and RIPPER. The workings of AntMiner+ and ALBA are explained briefly in the next two sections.
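To make concrete what a rule-based classifier looks like, the following sketch applies a small ordered rule list to a customer record and returns the matching rule as the reason for the prediction. The rules, feature names (`service_calls`, `day_minutes`, `intl_plan`, `intl_calls`), and thresholds are hypothetical and were not induced from the dataset used in this paper; they only illustrate the form such a rule-set might take.

```python
import operator

# Comparison operators allowed in rule conditions.
OPS = {"<=": operator.le, ">": operator.gt, "==": operator.eq}

# Hypothetical ordered rule list: (conditions, predicted class).
RULES = [
    ([("service_calls", ">", 3), ("day_minutes", "<=", 160.0)], "churn"),
    ([("intl_plan", "==", "yes"), ("intl_calls", "<=", 2)], "churn"),
]
DEFAULT_CLASS = "no churn"


def classify(customer):
    """Return (predicted class, conditions of the first rule that fired)."""
    for conditions, label in RULES:
        if all(OPS[op](customer[feat], val) for feat, op, val in conditions):
            return label, conditions
    # No rule fired: fall back to the default class with an empty explanation.
    return DEFAULT_CLASS, []


customer = {"service_calls": 5, "day_minutes": 120.0,
            "intl_plan": "no", "intl_calls": 4}
label, reason = classify(customer)
```

Because every prediction is accompanied by the conditions that triggered it, a marketing analyst can check whether the stated reason agrees with domain knowledge.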
Dataset
AntMiner+ and ALBA are applied to a publicly available dataset downloaded from the KDD library. The dataset is obtained from a wireless telecom operator, and consists of 5000 observations. For each observation 21 features are available, with no missing values. 14.3% of the customers are indicated to churn in the coming three months. For a full description of the dataset, one may refer to Larose (2005).
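To put the class distribution in perspective, the short calculation below derives the approximate number of churners from the stated figures and the accuracy that a trivial model predicting "no churn" for every customer would obtain. It uses only the two numbers quoted above, taken at face value; it is an illustration and not one of the experiments reported in the paper.

```python
n_customers = 5000   # observations in the dataset, as stated above
churn_rate = 0.143   # stated fraction of customers indicated to churn

# Approximate class counts implied by the stated churn rate.
n_churners = round(n_customers * churn_rate)
n_non_churners = n_customers - n_churners

# Accuracy of always predicting the majority class ("no churn").
baseline_accuracy = n_non_churners / n_customers

print(n_churners, n_non_churners, baseline_accuracy)
```

A model therefore has to exceed roughly 85.7% accuracy before it improves on the majority-class prediction, which is one reason accuracy alone is a weak yardstick on skewed churn data.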
Data preprocessing
Data preprocessing was conducted in the form
Conclusion
As discussed in the literature review, churn prediction models should be both accurate and comprehensible in order to improve the efficiency of retention marketing campaigns. This paper presents the application of AntMiner+ and ALBA to a publicly available churn prediction dataset. Both techniques explicitly seek to induce accurate as well as comprehensible rule-sets. The results are benchmarked against C4.5, RIPPER, SVM, and logistic regression. It is shown that ALBA, combined with RIPPER or C4.5,
Acknowledgements
We extend our gratitude to the Flemish Research Council for financial support (FWO postdoctoral research grant, Odysseus Grant B.0915.09), and the National Bank of Belgium (NBB/10/006).
References (68)
Customer satisfaction cues to support market segmentation and explain switching behavior. Journal of Business Research (2000).
Beam-ACO – hybridizing ant colony optimization with beam search: An application to open shop scheduling. Computers & Operations Research (2005).
Finding groups in data: Cluster analysis with ants. Applied Soft Computing (2009).
et al. Customer base analysis: Partial defection of behaviourally loyal clients in a non-contractual FMCG retail setting. European Journal of Operational Research (2005).
et al. Handling class imbalance in customer churn prediction. Expert Systems with Applications (2009).
Fast effective rule induction.
et al. Churn prediction in subscription services: An application of support vector machines while comparing two parameter-selection techniques. Expert Systems with Applications (2008).
et al. Inferring rules for business/ICT alignment using ants. Information and Management (2009).
et al. Path planning for autonomous mobile robot navigation with ant colony optimization and fuzzy cost function evaluation. Applied Soft Computing (2009).
et al. A modified Pareto/NBD approach for predicting customer lifetime value. Expert Systems with Applications (2009).