2015 | OriginalPaper | Book chapter

Mining Web Server Logs for Creating Workload Models

Authors: Fredrik Abbors, Dragos Truscan, Tanwir Ahmad

Published in: Software Technologies

Publisher: Springer International Publishing

Abstract

We present a tool-supported approach that uses data mining techniques to automatically infer workload models from historical web access log data. The workload models are represented as Probabilistic Timed Automata (PTA) and describe how users interact with the system. Owing to their stochastic nature, PTAs have several advantages over traditional approaches that simply play back scripted or pre-recorded traces: they are easier to create and maintain, and they achieve higher coverage of the tested application. The purpose of these models is to mimic real-user behavior as closely as possible when generating load. To show the validity and applicability of the proposed approach, we present a few experiments. The results show that the workload models automatically derived from web server logs generate load similar to that applied by real users on the system, and that they can serve as the starting point for the performance testing process.
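To make the idea of a probabilistic, timed workload model concrete, the following is a minimal sketch and not the authors' tool. It assumes log entries have already been parsed into hypothetical `(client_ip, timestamp, action)` tuples and, as a simplification, treats all requests from one IP address as a single session. For each observed transition between actions it estimates a probability and a mean think time, which are the two annotations a PTA edge would carry.

```python
from collections import defaultdict

# Hypothetical pre-parsed log entries: (client_ip, unix_timestamp, action).
# The action names and values are illustrative, not taken from the paper.
LOG = [
    ("10.0.0.1", 0, "browse"),
    ("10.0.0.1", 4, "search"),
    ("10.0.0.1", 10, "browse"),
    ("10.0.0.2", 1, "browse"),
    ("10.0.0.2", 3, "bid"),
]


def infer_model(entries):
    """Return {(src, dst): (probability, mean_think_time)} from log entries."""
    # Group requests per client; assumption: one IP address == one user session.
    by_user = defaultdict(list)
    for ip, ts, action in entries:
        by_user[ip].append((ts, action))

    counts = defaultdict(int)    # (src, dst) -> number of observed transitions
    delays = defaultdict(float)  # (src, dst) -> summed time between requests
    outgoing = defaultdict(int)  # src -> total transitions leaving src

    for requests in by_user.values():
        requests.sort()  # order each user's requests by timestamp
        for (t0, a0), (t1, a1) in zip(requests, requests[1:]):
            counts[(a0, a1)] += 1
            delays[(a0, a1)] += t1 - t0
            outgoing[a0] += 1

    # Probability = share of transitions leaving src that go to dst;
    # think time = average delay observed on that edge.
    return {
        edge: (n / outgoing[edge[0]], delays[edge] / n)
        for edge, n in counts.items()
    }


model = infer_model(LOG)
for (src, dst), (prob, think) in sorted(model.items()):
    print(f"{src} -> {dst}: p={prob:.2f}, think={think:.1f}s")
```

A load generator could then walk this structure, choosing the next action according to the edge probabilities and waiting for the associated think time before sending the request.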

Footnotes
1
Our approach uses IP-addresses for user classification since the UserId is only available for authenticated users and usually not present in the log.
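The choice described in this footnote can be illustrated with a short sketch. It assumes the log follows the Common Log Format, which the source does not state, and the sample line and the `parse_line` helper are hypothetical. The user field is `-` for unauthenticated requests, so the client host is the only identifier that is always present.

```python
import re

# Common Log Format: host ident authuser [date] "request" status bytes
# (assumed format; the paper's actual log layout is not given here).
CLF = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return the host and user of a log line, or None if it does not match."""
    match = CLF.match(line)
    if match is None:
        return None
    user = match.group("user")
    return {
        "host": match.group("host"),
        # "-" means the request was not authenticated.
        "user": None if user == "-" else user,
    }


# Hypothetical sample line for illustration only.
SAMPLE = '192.0.2.7 - - [10/Oct/2014:13:55:36 +0200] "GET /search HTTP/1.1" 200 2326'
entry = parse_line(SAMPLE)
print(entry)
```

Keying on the host field has a known limitation: several users behind one proxy or NAT gateway share an address and would be merged into one user.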
 
Metadata
Title
Mining Web Server Logs for Creating Workload Models
Authors
Fredrik Abbors
Dragos Truscan
Tanwir Ahmad
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-25579-8_8