Skip to main content

2017 | OriginalPaper | Buchkapitel

Integer Linear Programming for Pattern Set Mining; with an Application to Tiling

verfasst von : Abdelkader Ouali, Albrecht Zimmermann, Samir Loudni, Yahia Lebbah, Bruno Cremilleux, Patrice Boizumault, Lakhdar Loukil

Erschienen in: Advances in Knowledge Discovery and Data Mining

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Pattern set mining is an important part of a number of data mining tasks such as classification, clustering, database tiling, or pattern summarization. Efficiently mining pattern sets is a highly challenging task and most approaches use heuristic strategies. In this paper, we formulate the pattern set mining problem as an optimization task, ensuring that the produced solution is the best one from the entire search space. We propose a method based on integer linear programming (ILP) that is exhaustive, declarative and optimal. ILP solvers can exploit different constraint types to restrict the search space, and can use any pattern set measure (or combination thereof) as an objective function, allowing the user to focus on the optimal result. We illustrate and show the efficiency of our method by applying it to the tiling problem.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
2
The k-LTM implementation is Windows-only, and run times therefore only roughly indicate its behavior.
 
Literatur
2.
Zurück zum Zitat Bringmann, B., Zimmermann, A.: One in a million: picking the right patterns. Knowl. Inf. Syst. 18(1), 61–81 (2009)CrossRef Bringmann, B., Zimmermann, A.: One in a million: picking the right patterns. Knowl. Inf. Syst. 18(1), 61–81 (2009)CrossRef
3.
Zurück zum Zitat Cagliero, L., Chiusano, S., Garza, P., Bruno, G.: Pattern set mining with schema-based constraint. Knowl.-Based Syst. 84, 224–238 (2015)CrossRef Cagliero, L., Chiusano, S., Garza, P., Bruno, G.: Pattern set mining with schema-based constraint. Knowl.-Based Syst. 84, 224–238 (2015)CrossRef
4.
Zurück zum Zitat Cheng, H., Yan, X., Han, J., Hsu, C.-W.: Discriminative frequent pattern analysis for effective classification. In: ICDE 2007, Istanbul, Turkey, April 15, pp. 716–725 (2007) Cheng, H., Yan, X., Han, J., Hsu, C.-W.: Discriminative frequent pattern analysis for effective classification. In: ICDE 2007, Istanbul, Turkey, April 15, pp. 716–725 (2007)
5.
Zurück zum Zitat Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD 1996, Portland, pp. 226–231 (1996) Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD 1996, Portland, pp. 226–231 (1996)
7.
Zurück zum Zitat Guns, T., Nijssen, S., De Raedt, L.: k-pattern set mining under constraints. IEEE Trans. Knowl. Data Eng. 25(2), 402–418 (2013)CrossRef Guns, T., Nijssen, S., De Raedt, L.: k-pattern set mining under constraints. IEEE Trans. Knowl. Data Eng. 25(2), 402–418 (2013)CrossRef
8.
Zurück zum Zitat IBM/ILOG, Inc. ILOG CPLEX: High-performance software for mathematical programming and optimization (2016) IBM/ILOG, Inc. ILOG CPLEX: High-performance software for mathematical programming and optimization (2016)
9.
Zurück zum Zitat Jabbour, S., Sais, L., Salhi, Y.: The top-k frequent closed itemset mining using top-k SAT problem. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 403–418. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40994-3_26 CrossRef Jabbour, S., Sais, L., Salhi, Y.: The top-k frequent closed itemset mining using top-k SAT problem. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 403–418. Springer, Heidelberg (2013). doi:10.​1007/​978-3-642-40994-3_​26 CrossRef
10.
Zurück zum Zitat Jünger, M., Liebling, T.M., Naddef, D., Nemhauser, G.L., Pulleyblank, W.R., Reinelt, G., Rinaldi, G., Wolsey, L.A. (eds.): 50 Years of Integer Programming 1958–2008 - From the Early Years to the State-of-the-Art. Springer, Heidelberg (2010) Jünger, M., Liebling, T.M., Naddef, D., Nemhauser, G.L., Pulleyblank, W.R., Reinelt, G., Rinaldi, G., Wolsey, L.A. (eds.): 50 Years of Integer Programming 1958–2008 - From the Early Years to the State-of-the-Art. Springer, Heidelberg (2010)
11.
Zurück zum Zitat Kearns, M.J., Vazirani, U.V.: An Introduction to Computational Learning Theory. MIT Press, Cambridge (1994) Kearns, M.J., Vazirani, U.V.: An Introduction to Computational Learning Theory. MIT Press, Cambridge (1994)
12.
13.
Zurück zum Zitat Knobbe, A.J., Ho, E.K.Y.: Maximally informative k-itemsets and their efficient discovery. In: ACM SIGKDD 2006, Philadelphia, PA, USA, pp. 237–244 (2006) Knobbe, A.J., Ho, E.K.Y.: Maximally informative k-itemsets and their efficient discovery. In: ACM SIGKDD 2006, Philadelphia, PA, USA, pp. 237–244 (2006)
14.
Zurück zum Zitat Liu, B., Hsu, W., Ma, Y.: Integrating classification and association rules mining. In: Proceedings of Fourth International Conference on Knowledge Discovery & Data Mining (KDD 1998), pp. 80–86, New York. AAAI Press (1998) Liu, B., Hsu, W., Ma, Y.: Integrating classification and association rules mining. In: Proceedings of Fourth International Conference on Knowledge Discovery & Data Mining (KDD 1998), pp. 80–86, New York. AAAI Press (1998)
15.
Zurück zum Zitat Andrzej, J.O.: Integer and combinatorial optimization. Int. J. Adapt. Control Signal Process. 4(4), 333–334 (1990) Andrzej, J.O.: Integer and combinatorial optimization. Int. J. Adapt. Control Signal Process. 4(4), 333–334 (1990)
16.
Zurück zum Zitat Ouali, A., Loudni, S., Lebbah, Y., Boizumault, P., Zimmermann, A., Loukil, L.: Efficiently finding conceptual clustering models with integer linear programming. In: IJCAI 2016, New York, NY, USA, 9–15 July 2016, pp. 647–654 (2016) Ouali, A., Loudni, S., Lebbah, Y., Boizumault, P., Zimmermann, A., Loukil, L.: Efficiently finding conceptual clustering models with integer linear programming. In: IJCAI 2016, New York, NY, USA, 9–15 July 2016, pp. 647–654 (2016)
17.
Zurück zum Zitat Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Discovering frequent closed itemsets for association rules. In: Beeri, C., Buneman, P. (eds.) ICDT 1999. LNCS, vol. 1540, pp. 398–416. Springer, Heidelberg (1999). doi:10.1007/3-540-49257-7_25 CrossRef Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Discovering frequent closed itemsets for association rules. In: Beeri, C., Buneman, P. (eds.) ICDT 1999. LNCS, vol. 1540, pp. 398–416. Springer, Heidelberg (1999). doi:10.​1007/​3-540-49257-7_​25 CrossRef
18.
Zurück zum Zitat De Raedt, L., Zimmermann, A.: Constraint-based pattern set mining. In: SIAM 2007, 26–28 April 2007, Minneapolis, Minnesota, USA, pp. 237–248 (2007) De Raedt, L., Zimmermann, A.: Constraint-based pattern set mining. In: SIAM 2007, 26–28 April 2007, Minneapolis, Minnesota, USA, pp. 237–248 (2007)
19.
Zurück zum Zitat Shima, Y., Hirata, K., Harao, M.: Extraction of frequent few-overlapped monotone DNF formulas with depth-first pruning. In: Ho, T.B., Cheung, D., Liu, H. (eds.) PAKDD 2005. LNCS (LNAI), vol. 3518, pp. 50–60. Springer, Heidelberg (2005). doi:10.1007/11430919_8 CrossRef Shima, Y., Hirata, K., Harao, M.: Extraction of frequent few-overlapped monotone DNF formulas with depth-first pruning. In: Ho, T.B., Cheung, D., Liu, H. (eds.) PAKDD 2005. LNCS (LNAI), vol. 3518, pp. 50–60. Springer, Heidelberg (2005). doi:10.​1007/​11430919_​8 CrossRef
20.
Zurück zum Zitat Vreeken, J., van Leeuwen, M., Siebes, A.: KRIMP: mining itemsets that compress. Data Min. Knowl. Discov. 23(1), 169–214 (2011)MathSciNetCrossRefMATH Vreeken, J., van Leeuwen, M., Siebes, A.: KRIMP: mining itemsets that compress. Data Min. Knowl. Discov. 23(1), 169–214 (2011)MathSciNetCrossRefMATH
21.
Zurück zum Zitat Xin, D., Cheng, H., Yan, X., Han, J.: Extracting redundancy-aware top-k patterns. In: ACM SIGKDD 2006, Philadelphia, PA, USA, 20–23 August 2006, pp. 444–453 (2006) Xin, D., Cheng, H., Yan, X., Han, J.: Extracting redundancy-aware top-k patterns. In: ACM SIGKDD 2006, Philadelphia, PA, USA, 20–23 August 2006, pp. 444–453 (2006)
22.
Zurück zum Zitat Xindong, W., Vipin, K.: The Top Ten Algorithms in Data Mining, vol. 1. Chapman & Hall/CRC, Boca Raton (2009)MATH Xindong, W., Vipin, K.: The Top Ten Algorithms in Data Mining, vol. 1. Chapman & Hall/CRC, Boca Raton (2009)MATH
Metadaten
Titel
Integer Linear Programming for Pattern Set Mining; with an Application to Tiling
verfasst von
Abdelkader Ouali
Albrecht Zimmermann
Samir Loudni
Yahia Lebbah
Bruno Cremilleux
Patrice Boizumault
Lakhdar Loukil
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-57529-2_23

Premium Partner