2019 | OriginalPaper | Chapter

pysubgroup: Easy-to-Use Subgroup Discovery in Python

Authors: Florian Lemmerich, Martin Becker

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

This paper introduces the pysubgroup package for subgroup discovery in Python. Subgroup discovery is a well-established data mining task that aims at identifying describable subsets in the data that show an interesting distribution with respect to a certain target concept. The presented package provides an easy-to-use, compact and extensible implementation of state-of-the-art mining algorithms, interestingness measures, and visualizations. Since it builds directly on the established pandas data analysis library, a de facto standard for data science in Python, it integrates seamlessly into preprocessing and exploratory data analysis steps. Code related to this paper is available at: http://florian.lemmerich.net/pysubgroup.
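
To make the task in the abstract concrete, the following is a minimal plain-Python sketch of exhaustive subgroup discovery on a toy dataset. It does not use or reproduce the pysubgroup API; the function names (`wracc`, `discover`), the records, and the choice of weighted relative accuracy as the interestingness measure are assumptions made purely for illustration.

```python
from itertools import combinations

# Toy records; attribute names and values are invented for illustration.
rows = [
    {"sex": "f", "pclass": "1st", "survived": True},
    {"sex": "f", "pclass": "1st", "survived": True},
    {"sex": "f", "pclass": "2nd", "survived": True},
    {"sex": "f", "pclass": "2nd", "survived": True},
    {"sex": "m", "pclass": "1st", "survived": True},
    {"sex": "m", "pclass": "1st", "survived": False},
    {"sex": "m", "pclass": "2nd", "survived": False},
    {"sex": "m", "pclass": "2nd", "survived": False},
]


def wracc(rows, description, target):
    """Weighted relative accuracy: (n_sg / n) * (p_sg - p_all).

    `description` is a tuple of (attribute, value) pairs that must all hold.
    """
    covered = [r for r in rows if all(r[a] == v for a, v in description)]
    if not covered:
        return 0.0
    p_all = sum(r[target] for r in rows) / len(rows)
    p_sg = sum(r[target] for r in covered) / len(covered)
    return len(covered) / len(rows) * (p_sg - p_all)


def discover(rows, target, depth=2, top_k=3):
    """Score every conjunction of up to `depth` selectors; return the top_k."""
    attrs = sorted(k for k in rows[0] if k != target)
    selectors = sorted({(a, r[a]) for r in rows for a in attrs})
    candidates = []
    for d in range(1, depth + 1):
        for desc in combinations(selectors, d):
            # Skip descriptions that constrain the same attribute twice.
            if len({a for a, _ in desc}) < d:
                continue
            candidates.append((wracc(rows, desc, target), desc))
    # Highest quality first; the sort is stable, so ties keep generation order.
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_k]


for quality, desc in discover(rows, "survived"):
    label = " AND ".join(f"{a}={v}" for a, v in desc)
    print(f"{quality:+.4f}  {label}")
```

Each description is a conjunction of attribute-value selectors, which is what makes a subgroup "describable"; the quality function rewards subgroups that are both large and deviate from the overall target share. Exhaustive enumeration is only feasible for tiny search spaces, which is why dedicated mining algorithms such as those the abstract refers to are used in practice.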

Metadata
Title: pysubgroup: Easy-to-Use Subgroup Discovery in Python
Authors: Florian Lemmerich, Martin Becker
Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-10997-4_46
