
2023 | OriginalPaper | Chapter

On Ups and Downs in Analyzing Web Activity Data: Notes from a Project

Authors: Jan W. Owsiński, Marek Gajewski, Olgierd Hryniewicz, Agnieszka Jastrzębska, Mariusz Kozakiewicz, Karol Opara, Sławomir Zadrożny, Tomasz Zwierzchowski

Published in: International Symposium on Intelligent Informatics

Publisher: Springer Nature Singapore


Abstract

Analyzing data from the web is now one of the primary tasks, understood in many ways and pursued for a very wide variety of purposes. The talk describes the experience from a project devoted to analyzing such data and draws some more general conclusions. The project aimed to distinguish artificial ad-related traffic from genuine traffic. The rationale is simple: the flow of money depends on the number of clicks on, or views of, an ad, so fake clicking shifts the market to the benefit of some participants and to the loss of others. The talk describes the problem and its conceptual framing, as well as a number of technical details involving the issues and techniques of (1) variable analysis and choice; (2) clustering; (3) classification and classifiers; and (4) potential hybrid techniques, along with citations of the most interesting results. These often imply definite general conclusions, some of them quite surprising.
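As a rough illustration of the first two techniques named in the abstract (deriving variables from raw data, then clustering), the following minimal Python sketch aggregates a click log into per-user variables and splits the users into two groups with a plain one-dimensional 2-means. It is not the authors' method: the log format, the data in `EVENTS`, the chosen variables, and the function names `derive_features`, `two_means_1d` and `group_users` are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical raw click log: (user_id, timestamp_in_seconds). Invented data.
EVENTS = [
    ("u1", 0.0), ("u1", 41.0), ("u1", 97.0), ("u1", 260.0),
    ("u2", 0.0), ("u2", 2.0), ("u2", 4.0), ("u2", 6.0), ("u2", 8.0),
    ("u3", 5.0), ("u3", 80.0), ("u3", 300.0),
    ("u4", 1.0), ("u4", 2.0), ("u4", 3.0), ("u4", 4.0),
]


def derive_features(events):
    """Aggregate raw clicks into per-user variables (assumed, not from the paper)."""
    by_user = defaultdict(list)
    for user, timestamp in events:
        by_user[user].append(timestamp)

    features = {}
    for user, stamps in by_user.items():
        stamps.sort()
        # Time between consecutive clicks of the same user.
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        features[user] = {
            "clicks": len(stamps),
            "mean_gap": mean(gaps) if gaps else 0.0,
            "gap_spread": pstdev(gaps) if gaps else 0.0,
        }
    return features


def two_means_1d(values, iterations=20):
    """Plain 1-D k-means with k=2, initialised at the smallest and largest value."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        if not high_group:  # all values coincide; nothing to split
            break
        low, high = mean(low_group), mean(high_group)
    return low, high


def group_users(features, variable="mean_gap"):
    """Assign each user to the nearer of the two cluster centres of one variable."""
    values = [f[variable] for f in features.values()]
    low, high = two_means_1d(values)
    threshold = (low + high) / 2  # midpoint rule equals nearest-centre in 1-D
    return {
        user: "short" if f[variable] <= threshold else "long"
        for user, f in features.items()
    }


if __name__ == "__main__":
    feats = derive_features(EVENTS)
    for user, group in sorted(group_users(feats).items()):
        print(user, feats[user], group)
```

Note that the clustering step only groups users by similarity of the derived variable. Whether a group actually corresponds to artificial traffic cannot be read off the clusters alone and would require labelled data or expert judgment.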


Footnotes
1
The best illustration is provided by the most recent sanctions against Russia in the context of her aggression against Ukraine: one of the key issues concerned the banking system and the possibility of performing transactions.
 
2
The fact that the categorization is dichotomous or trichotomous does not mean that the problem is simple; consider, for example, the task of identifying irony or sarcasm in expressions found on the web.
 
3
In this sequence of steps, we concentrate on the cognitive aspect, but, of course, the business side (expected costs and benefits) normally has to be accounted for on a par with it.
 
4
Think of various functions, aggregates, statistical representations, etc., of the raw data.
 
5
In the same step: is there any common-sense approach to the problem, even a very rough one?
 
6
We definitely believe that science should lead to truth, but most often the truth is, out of necessity, merely approximated.
 
7
Web users are, as a rule, not aware that when they move to a web page meant to carry advertising content, their characteristics (as expressed through, in particular, “cookies”) guide a flash auction that determines the advertising material they will actually see.
 
8
We set aside the crawlers and bots with no “negative” objectives, such as those gathering statistical data.
 
Metadata
Copyright Year
2023
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-19-8094-7_37