
2018 | OriginalPaper | Book chapter

A Parallel Algorithm of Mining Frequent Pattern on Uncertain Data Streams


Abstract

Ever more data are generated every day, and practical applications demand increasingly efficient mining algorithms. Improving the time and space efficiency of mining algorithms has therefore become a major research topic in frequent pattern mining over uncertain data. To address frequent pattern mining over dynamic uncertain data streams, this paper builds on existing algorithms and proposes a parallel approximate mining algorithm based on the MapReduce framework, combined with a highly efficient algorithm for static data. With this algorithm, all frequent patterns in a sliding window can be mined using at most two MapReduce passes. In the experiments conducted for this paper, the frequent itemsets were in most cases discovered accurately after a single MapReduce pass. The experiments show that the time and space efficiency of the proposed algorithm is much better than that of the other algorithms compared.
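The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general setting, not the paper's method. It assumes the common expected-support model for uncertain data (each item carries an independent existence probability, and an itemset's expected support is the sum over transactions of the product of its items' probabilities) and simulates a single map/reduce pass over a sliding window in plain Python. The function names, the `max_len` cap, and the sample stream are all hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations
from math import prod


def map_phase(transaction, max_len=2):
    """Emit (itemset, probability) pairs for one uncertain transaction.

    `transaction` maps item -> existence probability. Items are assumed
    independent, so an itemset's probability is the product of its items'.
    """
    items = sorted(transaction)
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            yield itemset, prod(transaction[i] for i in itemset)


def reduce_phase(pairs):
    """Sum the probabilities per itemset to obtain its expected support."""
    totals = defaultdict(float)
    for itemset, p in pairs:
        totals[itemset] += p
    return dict(totals)


def mine_window(window, min_esup, max_len=2):
    """Return itemsets whose expected support in the window is >= min_esup."""
    pairs = (pair for t in window for pair in map_phase(t, max_len))
    return {s: e for s, e in reduce_phase(pairs).items() if e >= min_esup}


# Hypothetical uncertain stream; the window keeps the 3 newest transactions.
stream = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.4},
    {"a": 1.0, "b": 0.5},
    {"b": 0.9, "c": 0.9},
]
window = deque(maxlen=3)
results = []
for t in stream:
    window.append(t)  # the oldest transaction drops out automatically
    if len(window) == window.maxlen:
        results.append(mine_window(window, min_esup=1.0))
```

In a real MapReduce deployment the window would be partitioned across mappers and the summation would run in reducers keyed by itemset. The sketch enumerates candidates exhaustively up to `max_len`, which is exactly the cost that an efficient static-data algorithm would be brought in to avoid.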


Metadata
Title
A Parallel Algorithm of Mining Frequent Pattern on Uncertain Data Streams
Written by
Yanfen Chang
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-67071-3_47