Skip to main content
Top

2017 | OriginalPaper | Chapter

2. Algorithms for Redescription Mining

Authors : Esther Galbrun, Pauli Miettinen

Published in: Redescription Mining

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The aim of redescription mining is to find valid redescriptions for given data, query language, similarity relation, and user-specified constraints. In other words, we need to explore the search space consisting of query pairs from the query language, looking for those pairs that have similar enough support in the data and that satisfy the other constraints. In this chapter, we present the different methods that have been proposed to carry out this exploration efficiently. Existing methods can be arranged into three main categories: (1) mine-and-pair approaches, (2) alternating approaches, and (3) approaches that use atomic updates. We consider each one in turn, explaining its general common principles and looking at different algorithms designed on these principles. Next, we compare the different methods and discuss their relative strengths and weaknesses. Finally, we consider how to adapt the algorithms to handle cases where some values are missing from the input data.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Footnotes
1
The equation for the pessimistic Jaccard index presented by Galbrun and Miettinen (2012, Equation 5.7) is erroneous, as it misses two summands from the denominator.
 
Literature
go back to reference Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of 20th International Conference on Very Large Data Bases (VLDB’94), pp 487–499 Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of 20th International Conference on Very Large Data Bases (VLDB’94), pp 487–499
go back to reference Blockeel H, De Raedt L, Ramon J (1998) Top-down induction of clustering trees. In: Proceedings of the 15th International Conference on Machine Learning (ICML’98), pp 55–63 Blockeel H, De Raedt L, Ramon J (1998) Top-down induction of clustering trees. In: Proceedings of the 15th International Conference on Machine Learning (ICML’98), pp 55–63
go back to reference Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. CRC press, Boca Raton, FL Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. CRC press, Boca Raton, FL
go back to reference Garey MR, Johnson DS (2002) Computers and intractability. A guide to the theory of NP-completeness, vol 29. W. H. Freeman and Co., San Francisco, CA Garey MR, Johnson DS (2002) Computers and intractability. A guide to the theory of NP-completeness, vol 29. W. H. Freeman and Co., San Francisco, CA
go back to reference Kumar D (2007) Redescription mining: Algorithms and applications in bioinformatics. PhD thesis, Department of Computer Science, Virginia Polytechnic Institute and State University Kumar D (2007) Redescription mining: Algorithms and applications in bioinformatics. PhD thesis, Department of Computer Science, Virginia Polytechnic Institute and State University
go back to reference Mannila H, Toivonen H, Verkamo AI (1994) Efficient algorithms for discovering association rules. In: Proceedings of the 1994 AAAI Workshop on Knowledge Discovery in Databases (KDD’94), pp 181–192 Mannila H, Toivonen H, Verkamo AI (1994) Efficient algorithms for discovering association rules. In: Proceedings of the 1994 AAAI Workshop on Knowledge Discovery in Databases (KDD’94), pp 181–192
go back to reference Ramakrishnan N, Zaki MJ (2009) Redescription mining and applications in bioinformatics. In: Chen J, Lonardi S (eds) Biological Data Mining, Chapman and Hall/CRC, Boca Raton, FL Ramakrishnan N, Zaki MJ (2009) Redescription mining and applications in bioinformatics. In: Chen J, Lonardi S (eds) Biological Data Mining, Chapman and Hall/CRC, Boca Raton, FL
go back to reference Ramakrishnan N, Kumar D, Mishra B, Potts M, Helm RF (2004) Turning CARTwheels: An alternating algorithm for mining redescriptions. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’04), pp 266–275, https://doi.org/10.1145/1014052.1014083 Ramakrishnan N, Kumar D, Mishra B, Potts M, Helm RF (2004) Turning CARTwheels: An alternating algorithm for mining redescriptions. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’04), pp 266–275, https://​doi.​org/​10.​1145/​1014052.​1014083
Metadata
Title
Algorithms for Redescription Mining
Authors
Esther Galbrun
Pauli Miettinen
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-72889-6_2

Premium Partner