Published in: Neural Computing and Applications 1/2019

17-07-2018 | S.I. : Machine Learning Applications for Self-Organized Wireless Networks

Parallel and incremental credit card fraud detection model to handle concept drift and data imbalance

Authors: Akila Somasundaram, Srinivasulu Reddy


Abstract

Real-time fraud detection in credit card transactions is challenging due to the intrinsic properties of transaction data, namely data imbalance, noise, borderline entities and concept drift. The advent of mobile payment systems has further complicated the fraud detection process. This paper proposes a transaction window bagging (TWB) model, a parallel and incremental learning ensemble, as a solution to these issues in credit card transaction data. The TWB model uses a parallelized bagging approach, combined with an incremental learning model, a cost-sensitive base learner and a weighted voting-based combiner, to handle concept drift and data imbalance effectively. Experiments were performed with Brazilian Bank data and University of California, San Diego (UCSD) data, and the results were compared with state-of-the-art models. Comparisons on Brazilian Bank data indicate an increase in fraud detection levels of 18–38% and cost levels 1.3–2 times lower, demonstrating the enhanced performance of TWB. Comparisons on UCSD data indicate precision improvements ranging between 8 and 25%, indicating the robustness of the TWB model. Future extensions of the proposed model will incorporate feature engineering to improve performance.
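The abstract names the building blocks of TWB (per-window learners trained in parallel, incremental addition of new learners, a cost-sensitive base learner and a weighted vote) but gives no algorithmic detail on this page. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. The class names, the Gaussian naive Bayes base learner, the 10:1 cost ratio, the accuracy-based weights and the toy data are all illustrative assumptions.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pvariance


class CostSensitiveNB:
    """Tiny Gaussian naive Bayes with a cost-weighted decision rule (assumed base learner)."""

    def __init__(self, fn_cost=10.0, fp_cost=1.0):
        # Hypothetical costs: a missed fraud is assumed 10x worse than a false alarm.
        self.fn_cost, self.fp_cost = fn_cost, fp_cost

    def fit(self, X, y):
        self.stats = {}
        for label in (0, 1):  # 0 = legitimate, 1 = fraud
            cols = list(zip(*[x for x, t in zip(X, y) if t == label]))
            prior = y.count(label) / len(y)
            self.stats[label] = (prior, [mean(c) for c in cols],
                                 [pvariance(c) + 1e-6 for c in cols])
        return self

    def fraud_probability(self, x):
        logp = {}
        for label, (prior, mus, variances) in self.stats.items():
            lp = math.log(prior)
            for v, mu, var in zip(x, mus, variances):
                lp += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
            logp[label] = lp
        top = max(logp.values())  # subtract the max for numerical stability
        e0, e1 = math.exp(logp[0] - top), math.exp(logp[1] - top)
        return e1 / (e0 + e1)

    def predict(self, x):
        p = self.fraud_probability(x)
        # Flag fraud when the expected cost of missing it exceeds that of a false alarm.
        return int(p * self.fn_cost > (1 - p) * self.fp_cost)


class WindowBaggingSketch:
    """One learner per transaction window, combined by a weighted vote."""

    def __init__(self, max_members=5):
        self.members = []  # [model, weight] pairs, oldest first
        self.max_members = max_members

    def add_windows(self, windows):
        # Skip windows that lack one of the two classes; the toy learner needs both.
        usable = [(X, y) for X, y in windows if set(y) == {0, 1}]
        # Threads keep the sketch simple; a real system would likely use processes or a cluster.
        with ThreadPoolExecutor() as pool:
            models = list(pool.map(lambda w: CostSensitiveNB().fit(*w), usable))
        # Incremental step: append new learners and drop the oldest beyond the cap.
        self.members.extend([m, 1.0] for m in models)
        self.members = self.members[-self.max_members:]

    def reweight(self, X, y):
        # Assumed weighting scheme: accuracy on the most recent labelled window.
        for member in self.members:
            hits = sum(member[0].predict(x) == t for x, t in zip(X, y))
            member[1] = hits / len(y)

    def predict(self, x):
        fraud = sum(w for m, w in self.members if m.predict(x) == 1)
        total = sum(w for _, w in self.members)
        return int(2 * fraud > total)  # ties fall back to "legitimate"


def make_window(shift=0.0):
    """Deterministic toy window of [amount, hour] rows: 95 legitimate, 5 fraud."""
    legit = [[10.0 + i % 20 + shift, float(i % 24)] for i in range(95)]
    fraud = [[180.0 + 10 * i + shift, float(i * 5 % 24)] for i in range(5)]
    return legit + fraud, [0] * 95 + [1] * 5


ensemble = WindowBaggingSketch()
ensemble.add_windows([make_window(0.0), make_window(5.0)])  # initial batch of windows
ensemble.add_windows([make_window(10.0)])                   # later window with shifted amounts
ensemble.reweight(*make_window(10.0))
print(ensemble.predict([210.0, 3.0]), ensemble.predict([18.0, 14.0]))
```

In this sketch, concept drift is handled only by retiring the oldest learners and down-weighting members that perform poorly on the latest labelled window; the paper may use a different mechanism.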


Metadata
Title
Parallel and incremental credit card fraud detection model to handle concept drift and data imbalance
Authors
Akila Somasundaram
Srinivasulu Reddy
Publication date
17-07-2018
Publisher
Springer London
Published in
Neural Computing and Applications / Special Issue 1/2019
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-018-3633-8
