
2015 | OriginalPaper | Chapter

Data Mining Techniques in Health Informatics: A Case Study from Breast Cancer Research

Authors: Jing Lu, Alan Hales, David Rew, Malcolm Keech, Christian Fröhlingsdorf, Alex Mills-Mullett, Christian Wette

Published in: Information Technology in Bio- and Medical Informatics

Publisher: Springer International Publishing


Abstract

This paper presents a case study of applying data mining techniques to the analysis of diagnosis and treatment events related to breast cancer. Data from over 16,000 patients were pre-processed, and several data mining techniques were implemented using Weka (Waikato Environment for Knowledge Analysis). In particular, Generalized Sequential Patterns (GSP) mining was used to discover frequent patterns in disease event sequence profiles for groups of living and deceased patients. In addition, five classification models were evaluated with the objective of classifying patients based on selected attributes. The research showcases the data mining process and the techniques used to transform large amounts of patient data into useful information and potentially valuable patterns that can help in understanding cancer outcomes.
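
To illustrate what discovering frequent patterns from event sequences involves, the sketch below runs a simplified level-wise search in the spirit of Generalized Sequential Patterns mining. It is not the chapter's Weka workflow: the event names, the toy sequences, the `min_support` threshold, and the function names are all hypothetical, and each sequence element is assumed to be a single event rather than an itemset.

```python
def is_subsequence(pattern, sequence):
    """Return True if the pattern's events occur in the sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # `event in remaining` consumes the iterator up to the first match,
    # so later events must appear after earlier ones.
    return all(event in remaining for event in pattern)


def support(pattern, sequences):
    """Fraction of sequences that contain the pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences) / len(sequences)


def frequent_sequences(sequences, min_support, max_length=3):
    """Level-wise search: keep patterns meeting min_support, then extend them by one event."""
    events = sorted({e for s in sequences for e in s})
    frequent = {}
    candidates = [(e,) for e in events]
    for _ in range(max_length):
        survivors = [p for p in candidates if support(p, sequences) >= min_support]
        if not survivors:
            break
        for p in survivors:
            frequent[p] = support(p, sequences)
        # Simplified candidate generation (assumption): append each frequent
        # single event to every surviving pattern of the current length.
        frequent_events = [p[0] for p in frequent if len(p) == 1]
        candidates = [p + (e,) for p in survivors for e in frequent_events]
    return frequent


# Hypothetical toy event sequences, one per patient (not data from the study).
toy_sequences = [
    ("diagnosis", "surgery", "radiotherapy"),
    ("diagnosis", "surgery", "chemotherapy"),
    ("diagnosis", "chemotherapy", "surgery"),
    ("diagnosis", "surgery"),
]

patterns = frequent_sequences(toy_sequences, min_support=0.75)
for pattern, value in patterns.items():
    print(" -> ".join(pattern), value)
```

Running the same search separately on two groups of sequences (for example, living and deceased patients) and comparing the resulting pattern sets is one way such profiles could be contrasted; the full GSP algorithm also handles itemset elements and time constraints, which this sketch omits.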


Metadata
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-22741-2_6
