DOI: 10.1145/3167918.3167930
research-article

A systematic map of data analytics in breast cancer

Published: 29 January 2018

ABSTRACT

Data mining, or data analytics, is a set of techniques for analyzing data from different perspectives and summarizing it into useful information. It is the process of finding correlations or patterns in large historical datasets, and it can be applied in almost any field, from business to education to medicine. Data mining (DM) has been increasingly used in medicine, especially in oncology. Breast cancer (BC) has become the most common cancer among women worldwide and the leading cause of death in developed countries. Many studies have attempted to apply DM techniques to predict cancer survivability. This paper presents a systematic mapping study that analyzes and synthesizes studies on the application of DM techniques to breast cancer. A total of 403 articles published between 2000 and 2016 were selected and analyzed according to five criteria: year and channel of publication, research type, medical task, empirical type, and DM techniques. The results show that conferences and journals are the most common publication venues, that researchers were most interested in applying DM techniques to the diagnosis of BC, that historical-based evaluation was the empirical type most often used to evaluate DM techniques in BC, and that classification was the most investigated DM task in BC.
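As a rough illustration of how a mapping study of this kind tallies its selected articles, the sketch below classifies each study along the five criteria named in the abstract and counts the values of one criterion. The `Study` record, the `tally` helper, and the three sample entries are hypothetical placeholders invented for this example; they are not the authors' tooling or data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """One selected article, classified along the five mapping criteria.

    Year and channel of publication form a single criterion in the abstract,
    so the record has six fields for five criteria.
    """
    year: int
    channel: str
    research_type: str
    medical_task: str
    empirical_type: str
    dm_technique: str


# Hypothetical placeholder records, NOT data from the paper.
studies = [
    Study(2010, "conference", "evaluation research", "prognosis",
          "historical-based evaluation", "classification"),
    Study(2014, "journal", "solution proposal", "diagnosis",
          "historical-based evaluation", "classification"),
    Study(2015, "journal", "evaluation research", "diagnosis",
          "case study", "clustering"),
]


def tally(records, criterion):
    """Count how many studies fall under each value of one criterion."""
    return Counter(getattr(record, criterion) for record in records)


if __name__ == "__main__":
    for name in ("channel", "medical_task", "empirical_type", "dm_technique"):
        print(name, dict(tally(studies, name)))
```

Counting each criterion separately in this way is what would yield statements such as "classification was the most investigated task"; the real study would apply the same kind of tally to all 403 selected articles.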


Published in

ACSW '18: Proceedings of the Australasian Computer Science Week Multiconference
January 2018, 404 pages
ISBN: 9781450354363
DOI: 10.1145/3167918

Copyright © 2018 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ACSW '18 paper acceptance rate: 49 of 96 submissions, 51%. Overall acceptance rate: 204 of 424 submissions, 48%.
