Discovering Mammography-based Machine Learning Classifiers for Breast Cancer Diagnosis

  • ORIGINAL PAPER
  • Journal of Medical Systems

Abstract

This work explores the design of mammography-based machine learning classifiers (MLC) and proposes a new method for building MLC for breast cancer diagnosis. We evaluated a large number of MLC configurations for classifying feature vectors extracted from segmented regions (pathological lesion or normal tissue) on craniocaudal (CC) and/or mediolateral oblique (MLO) mammography image views, providing a BI-RADS diagnosis. Beforehand, appropriate combinations of image processing and normalization techniques were applied to reduce image artifacts and enhance mammogram detail. The method can be used under different data acquisition circumstances and exploits computer clusters to select well-performing MLC configurations. We evaluated 286 cases extracted from the repository owned by HSJ-FMUP, where specialized radiologists segmented regions on CC and/or MLO images (biopsies provided the gold standard). Around 20,000 MLC configurations were evaluated, yielding classifiers with an area under the ROC curve of 0.996 when combining feature vectors extracted from the CC and MLO views of the same case.
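To make the evaluation idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows three pieces: enumerating classifier configurations from a parameter grid, combining the CC and MLO feature vectors of one case, and scoring a classifier by the area under the ROC curve. All function names, parameter names, and toy values are hypothetical, and joining the two views by simple concatenation is an assumption; the abstract does not say how the views are combined.

```python
from itertools import product


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case gets the
    higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def combine_views(cc_features, mlo_features):
    """Join the CC and MLO feature vectors of the same case.
    Concatenation is an assumption made for illustration."""
    return list(cc_features) + list(mlo_features)


def enumerate_configs(grid):
    """Expand a dict of parameter lists into one dict per configuration."""
    keys = sorted(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


# Toy data: 1 = pathological lesion, 0 = normal tissue (hypothetical values).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))

# Hypothetical parameter grid; a real search would train and score
# one classifier per configuration, for example on a compute cluster.
grid = {"classifier": ["svm", "ann"], "hidden_units": [5, 10]}
print(len(enumerate_configs(grid)))
```

In a search of the scale described above, each configuration would be trained and validated independently, which is why the work parallelizes well across cluster nodes; the configuration with the highest validated AUC would then be kept.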


Acknowledgements

This work is part of the GRIDMED research collaboration project between INEGI (Portugal) and CETA-CIEMAT (Spain). Prof. Guevara acknowledges POPH - QREN-Tipologia 4.2 – Promotion of scientific employment funded by the ESF and MCTES, Portugal. CETA-CIEMAT acknowledges the support of the European Regional Development Fund.

Author information

Corresponding author

Correspondence to Raúl Ramos-Pollán.

About this article

Cite this article

Ramos-Pollán, R., Guevara-López, M.A., Suárez-Ortega, C. et al. Discovering Mammography-based Machine Learning Classifiers for Breast Cancer Diagnosis. J Med Syst 36, 2259–2269 (2012). https://doi.org/10.1007/s10916-011-9693-2
