Skip to main content

2017 | OriginalPaper | Buchkapitel

2. Algorithms for Redescription Mining

verfasst von : Esther Galbrun, Pauli Miettinen

Erschienen in: Redescription Mining

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The aim of redescription mining is to find valid redescriptions for given data, query language, similarity relation, and user-specified constraints. In other words, we need to explore the search space consisting of query pairs from the query language, looking for those pairs that have similar enough support in the data and that satisfy the other constraints. In this chapter, we present the different methods that have been proposed to carry out this exploration efficiently. Existing methods can be arranged into three main categories: (1) mine-and-pair approaches, (2) alternating approaches, and (3) approaches that use atomic updates. We consider each one in turn, explaining its general common principles and looking at different algorithms designed on these principles. Next, we compare the different methods and discuss their relative strengths and weaknesses. Finally, we consider how to adapt the algorithms to handle cases where some values are missing from the input data.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
The equation for the pessimistic Jaccard index presented by Galbrun and Miettinen (2012, Equation 5.7) is erroneous, as it misses two summands from the denominator.
 
Literatur
Zurück zum Zitat Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of 20th International Conference on Very Large Data Bases (VLDB’94), pp 487–499 Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of 20th International Conference on Very Large Data Bases (VLDB’94), pp 487–499
Zurück zum Zitat Blockeel H, De Raedt L, Ramon J (1998) Top-down induction of clustering trees. In: Proceedings of the 15th International Conference on Machine Learning (ICML’98), pp 55–63 Blockeel H, De Raedt L, Ramon J (1998) Top-down induction of clustering trees. In: Proceedings of the 15th International Conference on Machine Learning (ICML’98), pp 55–63
Zurück zum Zitat Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. CRC press, Boca Raton, FL Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. CRC press, Boca Raton, FL
Zurück zum Zitat Garey MR, Johnson DS (2002) Computers and intractability. A guide to the theory of NP-completeness, vol 29. W. H. Freeman and Co., San Francisco, CA Garey MR, Johnson DS (2002) Computers and intractability. A guide to the theory of NP-completeness, vol 29. W. H. Freeman and Co., San Francisco, CA
Zurück zum Zitat Kumar D (2007) Redescription mining: Algorithms and applications in bioinformatics. PhD thesis, Department of Computer Science, Virginia Polytechnic Institute and State University Kumar D (2007) Redescription mining: Algorithms and applications in bioinformatics. PhD thesis, Department of Computer Science, Virginia Polytechnic Institute and State University
Zurück zum Zitat Mannila H, Toivonen H, Verkamo AI (1994) Efficient algorithms for discovering association rules. In: Proceedings of the 1994 AAAI Workshop on Knowledge Discovery in Databases (KDD’94), pp 181–192 Mannila H, Toivonen H, Verkamo AI (1994) Efficient algorithms for discovering association rules. In: Proceedings of the 1994 AAAI Workshop on Knowledge Discovery in Databases (KDD’94), pp 181–192
Zurück zum Zitat Mihelčić M, Džeroski S, Lavrač N, Šmuc T (2016) Redescription mining with multi-target predictive clustering trees. In: Proceedings of the 4th International Workshop on the New Frontiers in Mining Complex Patterns (NFMCP’15), pp 125–143, https://doi.org/10.1007/978-3-319-39315-5_9 Mihelčić M, Džeroski S, Lavrač N, Šmuc T (2016) Redescription mining with multi-target predictive clustering trees. In: Proceedings of the 4th International Workshop on the New Frontiers in Mining Complex Patterns (NFMCP’15), pp 125–143, https://​doi.​org/​10.​1007/​978-3-319-39315-5_​9
Zurück zum Zitat Ramakrishnan N, Zaki MJ (2009) Redescription mining and applications in bioinformatics. In: Chen J, Lonardi S (eds) Biological Data Mining, Chapman and Hall/CRC, Boca Raton, FL Ramakrishnan N, Zaki MJ (2009) Redescription mining and applications in bioinformatics. In: Chen J, Lonardi S (eds) Biological Data Mining, Chapman and Hall/CRC, Boca Raton, FL
Zurück zum Zitat Ramakrishnan N, Kumar D, Mishra B, Potts M, Helm RF (2004) Turning CARTwheels: An alternating algorithm for mining redescriptions. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’04), pp 266–275, https://doi.org/10.1145/1014052.1014083 Ramakrishnan N, Kumar D, Mishra B, Potts M, Helm RF (2004) Turning CARTwheels: An alternating algorithm for mining redescriptions. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’04), pp 266–275, https://​doi.​org/​10.​1145/​1014052.​1014083
Zurück zum Zitat Zhao L, Zaki MJ, Ramakrishnan N (2006) BLOSOM: A framework for mining arbitrary Boolean expressions. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’06), pp 827–832, https://doi.org/10.1145/1150402.1150511 Zhao L, Zaki MJ, Ramakrishnan N (2006) BLOSOM: A framework for mining arbitrary Boolean expressions. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’06), pp 827–832, https://​doi.​org/​10.​1145/​1150402.​1150511
Metadaten
Titel
Algorithms for Redescription Mining
verfasst von
Esther Galbrun
Pauli Miettinen
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-72889-6_2