Published in: Social Network Analysis and Mining 2/2013

01-06-2013 | Original Article

Web sessions clustering using hybrid sequence alignment measure (HSAM)

Authors: G. Poornalatha, S. Raghavendra Prakash

Abstract

Web usage mining inspects the navigation patterns in web access logs and extracts previously unknown and useful information. This may lead to strategies for various web-oriented applications such as website restructuring, recommender systems, and web page prediction. The current work clusters user sessions of uneven lengths to discover access patterns, and proposes a distance measure for grouping those sessions. The proposed hybrid distance measure uses the access path information to find the distance between any two sessions without altering the order in which web pages are visited. The R² statistic is used to decide the number of clusters to construct. The Jaccard index and the Davies–Bouldin validity index are employed to assess the resulting clustering. The results obtained with these two standard measures are encouraging and indicate the goodness of the clusters created.
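The abstract does not give the HSAM formula, so the sketch below is not the authors' measure. It only illustrates the general idea of an order-preserving distance between sessions of uneven length, using a plain global (Needleman–Wunsch style) alignment. The function names, the scoring values (match +1, mismatch −1, gap −1), and the normalisation to [0, 1] are all assumptions made for this illustration.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of two page sequences (illustrative scoring)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per page.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two pages
                score[i - 1][j] + gap,       # skip a page of session a
                score[i][j - 1] + gap,       # skip a page of session b
            )
    return score[-1][-1]


def session_distance(a, b):
    """Hypothetical normalisation: 0.0 for identical sessions, at most 1.0."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    # With the default scores the alignment score lies in [-longest, longest].
    return (longest - alignment_score(a, b)) / (2 * longest)


if __name__ == "__main__":
    s1 = ["home", "products", "cart"]
    s2 = ["home", "cart"]
    print(session_distance(s1, s2))
    # Same pages, different visiting order: the distance is non-zero.
    print(session_distance(["home", "cart"], ["cart", "home"]))
```

Because the alignment never reorders pages, two sessions that visit the same pages in a different order still get a non-zero distance, which a set-based comparison would miss.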

Metadata
Title
Web sessions clustering using hybrid sequence alignment measure (HSAM)
Authors
G. Poornalatha
S. Raghavendra Prakash
Publication date
01-06-2013
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 2/2013
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-012-0070-z
