2017 | OriginalPaper | Chapter

Experiences Using Decision Trees for Knowledge Discovery

Authors: Eva Armengol, Àngel García-Cerdaña, Pilar Dellunde

Published in: Fuzzy Sets, Rough Sets, Multisets and Clustering

Publisher: Springer International Publishing

Abstract

Knowledge discovery is the process of identifying useful patterns in large data sets. Two families of approaches are used for knowledge discovery: clustering, when the classes of the domain objects are not known; and inductive learning algorithms, when the classes are known and the goal is to construct a domain model useful for identifying new, unseen objects. Clustering algorithms have also been proposed for analyzing data when the classes are known. However, to our knowledge, inductive learning methods have been used only for prediction, not for analyzing the available data. We propose a methodology, called FTree, that uses a decision tree to analyze both the available data, identifying patterns, and some important aspects of the domain (at least of the part of the domain represented by the data at hand), such as similarity between classes, separability, characterization of classes, and even possible errors in the data.

Metadata
Title: Experiences Using Decision Trees for Knowledge Discovery
Authors: Eva Armengol, Àngel García-Cerdaña, Pilar Dellunde
Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-47557-8_11
