Published in: Artificial Intelligence Review 3/2020

28.05.2019

A review of alignment based similarity measures for web usage mining

Authors: Vinh-Trung Luu, Germain Forestier, Jonathan Weber, Paul Bourgeois, Fahima Djelil, Pierre-Alain Muller


Abstract

To understand the behavior of web-based application users, web usage mining applies unsupervised learning techniques to discover hidden patterns in web data that capture user browsing on web sites. For this purpose, web session clustering has been among the most popular approaches to group users with similar browsing patterns that reflect their common interests. The quality of a web session clustering implementation depends significantly on the measure used to evaluate the similarity of sessions. An efficient approach to evaluating session similarity is sequence alignment, which is known as the task of determining the similarity of elements between sequences. In this paper, we review and compare sequence alignment-based measures for web sessions, and also discuss sequence similarity measures that are not alignment-based. This review also provides a perspective on sequence similarity measures that manipulate web sessions in the usage clustering process.
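As an illustration of what an alignment-based session similarity can look like, the following is a minimal sketch of a global (Needleman-Wunsch style) alignment score between two sessions, each represented as a list of visited pages. It is not the measure proposed or evaluated in the paper: the scoring values (match = 1, mismatch = -1, gap = -1), the normalization by the longer session, the function names, and the page names are all assumptions chosen for illustration.

```python
from typing import Hashable, Sequence


def global_alignment_score(
    a: Sequence[Hashable],
    b: Sequence[Hashable],
    match: int = 1,      # assumed reward for identical pages
    mismatch: int = -1,  # assumed penalty for different pages
    gap: int = -1,       # assumed penalty for skipping a page
) -> int:
    """Score of the best global alignment of sessions a and b (dynamic programming)."""
    n, m = len(a), len(b)
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * gap] + [0] * m
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # a[i-1] aligned with a gap
                curr[j - 1] + gap,  # b[j-1] aligned with a gap
            )
        prev = curr
    return prev[m]


def session_similarity(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Alignment score divided by the longer session length (range [-1, 1])."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty sessions are treated as identical
    return global_alignment_score(a, b) / longest


if __name__ == "__main__":
    # Hypothetical sessions; page names are invented for this example.
    s1 = ["home", "products", "cart", "checkout"]
    s2 = ["home", "products", "checkout"]
    print(session_similarity(s1, s2))
```

A pairwise similarity of this kind could be computed for every pair of sessions and passed to a clustering algorithm that accepts a precomputed similarity or distance matrix. Local alignment or page-level similarity scores in place of exact matching would be alternative design choices.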


Metadata
Title
A review of alignment based similarity measures for web usage mining
Authors
Vinh-Trung Luu
Germain Forestier
Jonathan Weber
Paul Bourgeois
Fahima Djelil
Pierre-Alain Muller
Publication date
28.05.2019
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review / Issue 3/2020
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-019-09712-9
