Published in: Social Network Analysis and Mining, Issue 2/2013

1 June 2013 | Original Article

Web sessions clustering using hybrid sequence alignment measure (HSAM)

Authors: G. Poornalatha, S. Raghavendra Prakash


Abstract

Web usage mining examines navigation patterns in web access logs to extract previously unknown, useful information. Such information can inform strategies for web-oriented applications such as website restructuring, recommender systems, and web page prediction. This work clusters user sessions of uneven lengths to discover access patterns, and proposes a distance measure for grouping those sessions. The proposed hybrid distance measure uses access path information to compute the distance between any two sessions without altering the order in which web pages were visited. R² is used to decide the number of clusters to construct. The Jaccard index and the Davies–Bouldin validity index are used to assess the resulting clustering. The results obtained with these two standard measures are encouraging and indicate that the clusters are of good quality.
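
The abstract does not give the definition of the hybrid measure, so the following is only a minimal sketch of the general idea of an order-preserving, alignment-based distance between sessions of uneven lengths. It uses plain global (Needleman-Wunsch style) alignment, not HSAM itself. The function names, the scoring values (match 2, mismatch -1, gap -1), and the normalisation to [0, 1] are assumptions chosen for illustration.

```python
from typing import Sequence


def alignment_score(a: Sequence[str], b: Sequence[str],
                    match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Global alignment score of two page sequences (illustrative scoring)."""
    # prev[j] holds the best score for aligning the first i-1 pages of a
    # with the first j pages of b; row 0 is "all gaps".
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


def session_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Map the alignment score to [0, 1]; 0 means identical sessions.

    Assumed normalisation: with the default scoring, the score lies between
    -longest (nothing matches) and 2 * longest (identical sessions).
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    best, worst = 2 * longest, -longest
    return (best - alignment_score(a, b)) / (best - worst)


if __name__ == "__main__":
    s1 = ["home", "products", "cart", "checkout"]
    s2 = ["home", "cart", "checkout"]  # shorter session, same page order
    print(session_distance(s1, s2))
```

Because the alignment never reorders pages, two sessions that visit the same pages in a different order still receive a non-zero distance. A matrix of such pairwise distances could then be passed to a clustering algorithm of choice.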

Metadata
Title
Web sessions clustering using hybrid sequence alignment measure (HSAM)
Authors
G. Poornalatha
S. Raghavendra Prakash
Publication date
1 June 2013
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 2/2013
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-012-0070-z
