Abstract
This work explores the design of mammography-based machine learning classifiers (MLC) and proposes a new method to build MLC for breast cancer diagnosis. We evaluated a large number of MLC configurations to classify feature vectors extracted from segmented regions (pathological lesion or normal tissue) on craniocaudal (CC) and/or mediolateral oblique (MLO) mammography image views, providing a BI-RADS diagnosis. Beforehand, appropriate combinations of image processing and normalization techniques were applied to reduce image artifacts and enhance mammogram detail. The method can be used under different data acquisition circumstances and exploits computer clusters to select well-performing MLC configurations. We evaluated 286 cases extracted from the repository owned by HSJ-FMUP, where specialized radiologists segmented regions on CC and/or MLO images (biopsies provided the gold standard). Around 20,000 MLC configurations were evaluated, yielding classifiers that achieve an area under the ROC curve of 0.996 when combining feature vectors extracted from the CC and MLO views of the same case.
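The selection idea summarized above (score many classifier configurations by area under the ROC curve and keep the best) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy cases, the linear scoring rule, the weight grid, and the helper names `auc`, `combine_views`, and `select_best` are all assumptions made for illustration, and no held-out validation is performed.

```python
from itertools import product


def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count correctly ordered positive/negative pairs; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def combine_views(cc_features, mlo_features):
    """Concatenate the CC and MLO feature vectors of the same case."""
    return list(cc_features) + list(mlo_features)


def select_best(configs, evaluate):
    """Return (config, auc) for the configuration with the highest AUC."""
    return max(((c, evaluate(c)) for c in configs), key=lambda pair: pair[1])


# Made-up data: (CC features, MLO features, label), 1 = pathological lesion.
cases = [
    ([0.9, 0.2], [0.8], 1),
    ([0.7, 0.1], [0.9], 1),
    ([0.2, 0.3], [0.1], 0),
    ([0.4, 0.8], [0.3], 0),
]
labels = [y for _, _, y in cases]


def evaluate(config):
    """Hypothetical 'classifier': a weighted sum with one weight per feature."""
    scores = [
        sum(w * x for w, x in zip(config, combine_views(cc, mlo)))
        for cc, mlo, _ in cases
    ]
    return auc(scores, labels)


# Hypothetical configuration grid: every 0/1 weighting of the three features.
grid = list(product([0.0, 1.0], repeat=3))
best_config, best_auc = select_best(grid, evaluate)
print(best_config, best_auc)
```

Each call to `evaluate` is independent of the others, which is the property that lets a search like this be spread across a computer cluster.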
Acknowledgements
This work is part of the GRIDMED research collaboration project between INEGI (Portugal) and CETA-CIEMAT (Spain). Prof. Guevara acknowledges POPH - QREN-Tipologia 4.2 – Promotion of scientific employment, funded by the ESF and MCTES, Portugal. CETA-CIEMAT acknowledges the support of the European Regional Development Fund.
Cite this article
Ramos-Pollán, R., Guevara-López, M.A., Suárez-Ortega, C. et al. Discovering Mammography-based Machine Learning Classifiers for Breast Cancer Diagnosis. J Med Syst 36, 2259–2269 (2012). https://doi.org/10.1007/s10916-011-9693-2
DOI: https://doi.org/10.1007/s10916-011-9693-2