Skip to main content
Log in

An efficient algorithm for mining periodic high-utility sequential patterns

  • Published:
Applied Intelligence Aims and scope Submit manuscript

Abstract

A periodic high-utility sequential pattern (PHUSP) is a pattern that not only yields a high-utility (e.g. high profit) but also appears regularly in a sequence database. Finding PHUSPs is useful for several applications such as market basket analysis, where it can reveal recurring and profitable customer behavior. Although discovering PHUSPs is desirable, it is computationally difficult. To discover PHUSPs efficiently, this paper proposes a structure for periodic high-utility sequential pattern mining (PHUSPM) named PUSP. Furthermore, to reduce the search space and speed up PHUSPM, a pruning strategy is developed. This results in an efficient algorithm called periodic high-utility sequential pattern optimal miner (PUSOM). An experimental evaluation was performed on both synthetic and real-life datasets to compare the performance of PUSOM with state-of-the-art PHUSPM algorithms in terms of execution time, memory usage and scalability. Experimental results show that the PUSOM algorithm can efficiently discover the complete set of PHUSPs. Moreover, it outperforms the other four algorithms as the former can prune many unpromising patterns using its designed structure and pruning strategy.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

Similar content being viewed by others

References

  1. Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceedings of the 11th international conference on data engineering, 1995, IEEE, pp 3–14. https://doi.org/10.1109/icde.1995.380415

  2. Ahmed CF, Tanbeer SK, Jeong B-S (2010) A novel approach for mining high-utility sequential patterns in sequence databases. ETRI J 32(5):676–686. https://doi.org/10.4218/etrij.10.1510.0066

    Article  Google Scholar 

  3. Amphawan K, Lenca P, Surarerks A (2009) Mining top-k periodic-frequent pattern from transactional databases without support threshold. In: Advances in information technology, pp 18–29. https://doi.org/10.1007/978-3-642-10392-6_3

    Chapter  Google Scholar 

  4. Dinh T, Huynh V-N, Le B (2017) Mining periodic high utility sequential patterns. In: Asian conference on intelligent information and database systems. Springer, Berlin, pp 545–555. https://doi.org/10.1007/978-3-319-54472-4_51

    Chapter  Google Scholar 

  5. Dinh T, Quang MN, Le B (2015). In: Proceedings of the sixth international symposium on information and communication technology. ACM, A novel approach for hiding high utility sequential patterns, pp 121–128. https://doi.org/10.1145/2833258.2833271

  6. Fournier-Viger P, Gomariz A, Gueniche T, Soltani A, Wu C-W, Tseng VS et al (2014) SPMF: a Java open-source pattern mining library. J Mach Learn Res 15(1):3389–3393. https://doi.org/10.1007/978-3-319-46131-1_8

    MATH  Google Scholar 

  7. Fournier-Viger J, Lin C-W, Dinh T, Le HB (2016) Mining correlated high-utility itemsets using the bond measure. In: International conference on hybrid artificial intelligence systems. Springer, pp 53–65. https://doi.org/10.1007/978-3-319-32034-2_5

    Google Scholar 

  8. Fournier-Viger P, Lin JC-W, Duong Q-H, Dam T-L (2016) Phm: mining periodic high-utility itemsets. In: Industrial Conference on Data Mining. Springer, pp 64–79. https://doi.org/10.1007/978-3-319-41561-1_6

    Chapter  Google Scholar 

  9. Fournier-Viger P, Lin JC-W, Kiran RU, Koh YS, Thomas R (2017) A survey of sequential pattern mining. Data Sci Pattern Recognit 1(1):54–77

    Google Scholar 

  10. Fournier-Viger P, Zhang Y, Lin JC-W, Dinh D-T, Le HB (2018) Mining correlated high-utility itemsets using various measures. Logic J IGPL

  11. Fradkin D, Mörchen F (2015) Mining sequential patterns for classification. Knowl Inf Syst 45(3):731–749. https://doi.org/10.1007/s10115-014-0817-0

    Article  Google Scholar 

  12. Gan W, Lin JC-W, Fournier-Viger P, Chao H-C, Hong T-P, Fujita H (2018) A survey of incremental high-utility itemset mining. Wiley Interdiscip Rev: Data Min Knowl Discov 8(2):e1242. https://doi.org/10.1002/widm.1242

    Google Scholar 

  13. Gan W, Lin JC-W, Fournier-Viger P, Chao H-C, Tseng VS, Yu PS (2018) A survey of utility-oriented pattern mining. arXiv:1805.10511

  14. Gan W, Lin JC-W, Fournier-Viger P, Chao H-C, Yu PS (2018) A Survey of Parallel Sequential Pattern Mining. arXiv:1805.10515

  15. Han J, Dong G, Yin Y (1999) Efficient mining of partial periodic patterns in time series database. In: 15th international conference on data engineering, 1999. Proceedings. IEEE, pp 106–115. https://doi.org/10.1109/icde.1999.754913

  16. Kiran RU, Kitsuregawa M, Reddy PK (2016) Efficient discovery of periodic-frequent patterns in very large databases. J Syst Softw 112:110–121. https://doi.org/10.1016/j.jss.2015.10.035

    Article  Google Scholar 

  17. Kiran RU, Venkatesh JN, Toyoda M, Kitsuregawa M, Reddy PK (2017) Discovering partial periodic-frequent patterns in a transactional database. J Syst Softw 125:170–182. https://doi.org/10.1016/j.jss.2016.11.035

    Article  Google Scholar 

  18. Lan G-C, Hong T-P, Tseng VS, Wang S-L (2014) Applying the maximum utility measure in high utility sequential pattern mining. Exp Syst Appl 41(11):5071–5081. https://doi.org/10.1016/j.eswa.2014.02.022

    Article  Google Scholar 

  19. Le B, Dinh D-T, Huynh V-N, Nguyen Q-M, Fournier-Viger P (2018) An efficient algorithm for hiding high utility sequential patterns. Int J Approx Reason. https://doi.org/10.1016/j.ijar.2018.01.005

    Article  MathSciNet  Google Scholar 

  20. Le B, Huynh U, Dinh D-T (2018) A pure array structure and parallel strategy for high-utility sequential pattern mining. Exp Syst Appl 104:107–120. https://doi.org/10.1016/j.eswa.2018.03.019

    Article  Google Scholar 

  21. Lin JC-W, Zhang J, Fournier-Viger P (2017) High-utility sequential pattern mining with multiple minimum utility thresholds. In: Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) joint conference on web and big data. Springer, pp 215–229. https://doi.org/10.1007/978-3-319-63579-8_17

    Chapter  Google Scholar 

  22. Lin JC-W, Zhang J, Fournier-Viger P (2017) High-utility sequential pattern mining with multiple minimum utility thresholds. In: Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) joint conference on web and big data. Springer, pp 215–229. https://doi.org/10.1109/smc.2016.7844710

  23. Lin JC-W, Zhang J, Fournier-Viger P, Hong T-P, Zhang J (2017) A two-phase approach to mine short-period high-utility itemsets in transactional databases. Adv Eng Inform 33:29–43. https://doi.org/10.1016/j.aei.2017.04.007

    Article  Google Scholar 

  24. Perera D, Kay J, Koprinska I, Yacef K, Zaïane OR (2009) Clustering and sequential pattern mining of online collaborative learning data. IEEE Trans Knowl Data Eng 21(6):759–772. https://doi.org/10.1109/tkde.2008.138

    Article  Google Scholar 

  25. Quang MN, Dinh T, Huynh U, Le B (2016) Mhhusp: an integrated algorithm for mining and hiding high utility sequential patterns. In: 2016 Eighth international conference on knowledge and systems engineering (KSE). IEEE, pp 13–18. https://doi.org/10.1109/kse.2016.7758022

  26. Quang MN, Huynh U, Dinh T, Le NH, Le B (2016) An approach to decrease execution time and difference for hiding high utility sequential patterns. In: Integrated uncertainty in knowledge modelling and decision making: 5th international symposium, IUKM 2016, Da Nang, Vietnam, November 30–December 2, 2016, Proceedings 5. Springer, pp 435–446. https://doi.org/10.1007/978-3-319-49046-5_37

    Google Scholar 

  27. Shie B-E, Cheng J-H, Chuang K-T, Tseng VS (2012) A one-phase method for mining high utility mobile sequential patterns in mobile commerce environments. In: Proceedings of 25th international conference on industrial engineering and other applications of applied intelligent systems. Springer, pp 616–626. https://doi.org/10.1007/978-3-642-31087-4_63

    Chapter  Google Scholar 

  28. Srikant R, Agrawal R (1996) Mining sequential patterns: generalizations and performance improvements. In: Advances in Database Technology–EDBT’96, pp 1–17. https://doi.org/10.1007/bfb0014140

    Google Scholar 

  29. Surana A, Kiran RU, Reddy PK (2012) An efficient approach to mine periodic-frequent patterns in transactional databases. Springer, Berlin, pp 254–266. https://doi.org/10.1007/978-3-642-28320-8_22

    Google Scholar 

  30. Tanbeer SK, Ahmed CF, Jeong B-S, Lee Y-K (2009) Discovering periodic-frequent patterns in transactional databases. In: Pacific-Asia conference on knowledge discovery and data mining. Springer, pp 242–253. https://doi.org/10.1007/978-3-642-01307-2_24

    Chapter  Google Scholar 

  31. Wang J-Z, Huang J-L, Chen Y-C (2016) On efficiently mining high utility sequential patterns. Knowl Inf Syst 49(2):597–627. https://doi.org/10.1007/s10115-015-0914-8

    Article  Google Scholar 

  32. Wright AP, Wright AT, McCoy AB, Sittig DF (2015) The use of sequential pattern mining to predict next prescribed medications. J Biomed Inform 53:73–80. https://doi.org/10.1016/j.jbi.2014.09.003

    Article  Google Scholar 

  33. Wu Y, Wang L, Ren J, Ding W, Wu X (2014) Mining sequential patterns with periodic wildcard gaps. Appl Intell 41(1):99–116. https://doi.org/10.1007/s10489-013-0499-4

    Article  Google Scholar 

  34. Yao H, Hamilton HJ, Butz CJ (2004) A foundational approach to mining itemset utilities from databases. In: Proceedings of the 2004 SIAM international conference on data mining. SIAM, pp 482–486. https://doi.org/10.1137/1.9781611972740.51

    Chapter  Google Scholar 

  35. Yin J, Zheng Z, Cao L (2012) USpan: an efficient algorithm for mining high utility sequential patterns. In: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 660–668. https://doi.org/10.1145/2339530.2339636

  36. Yin J, Zheng Z, Cao L, Song Y, Wei W (2013) Efficiently mining top-k high utility sequential patterns. In: 2013 IEEE 13th international conference on data mining (ICDM). IEEE, pp 1259–1264. https://doi.org/10.1109/icdm.2013.148

  37. Yu X, Zhang Z, Yu H, Jiang F, Ji W (2015) An asynchronous periodic sequential pattern mining algorithm with multiple minimum item supports for ad hoc networking. J Sens. https://doi.org/10.1155/2015/461659

    Google Scholar 

  38. Yun U, Kim D, Yoon E, Fujita H (2017) Damped window based high average utility pattern mining over data streams. Knowl-Based Syst https://doi.org/10.1016/j.knosys.2017.12.029

    Article  Google Scholar 

  39. Yun U, Ryang H, Lee G, Fujita H (2017) An efficient algorithm for mining high utility patterns from incremental databases with one database scan. Knowledge-Based Syst 124:188–206. https://doi.org/10.1016/j.knosys.2017.03.016

    Article  Google Scholar 

  40. Zihayat M, Chen Y, An A (2017) Memory-adaptive high utility sequential pattern mining over data streams. Mach Learn 106(6):799–836. https://doi.org/10.1007/s10994-016-5617-1

    Article  MathSciNet  Google Scholar 

Download references

Acknowledgments

This research is funded by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.05-2015.07.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Van-Nam Huynh.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Dinh, DT., Le, B., Fournier-Viger, P. et al. An efficient algorithm for mining periodic high-utility sequential patterns. Appl Intell 48, 4694–4714 (2018). https://doi.org/10.1007/s10489-018-1227-x

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10489-018-1227-x

Keywords

Navigation