ABSTRACT
Data mining (DM), or data analytics, is a set of techniques for analyzing data from different perspectives and summarizing it into useful information. It is the process of finding correlations or patterns in large historical datasets, and it can be applied in almost any field, from business to education to medicine. Data mining has been increasingly used in medicine, especially in oncology. Breast cancer (BC) has become the most common cancer among females worldwide and the leading cause of death in developed countries. Many studies have attempted to apply DM techniques to predict cancer survivability in human beings. This paper presents a systematic mapping study that analyzes and synthesizes studies on the application of DM techniques in breast cancer. A total of 403 articles published between 2000 and 2016 were selected and analyzed according to five criteria: year and channel of publication, research type, medical task, empirical type, and DM techniques. The results show that conferences and journals were the most common publication venues, that researchers were most interested in applying DM techniques to the diagnosis of BC, that history-based evaluation was the empirical type most often used to evaluate DM techniques in BC, and that classification was the most investigated DM task in BC.