2015 | OriginalPaper | Book chapter

A Machine Learning Approach to Cluster the Users of Stack Overflow Forum

Written by: J. Anusha, V. Smrithi Rekha, P. Bagavathi Sivakumar

Published in: Artificial Intelligence and Evolutionary Algorithms in Engineering Systems

Publisher: Springer India

Abstract

Online question and answer (Q&A) forums are emerging as excellent learning platforms for learners with varied interests. In this paper, we present our results on clustering Stack Overflow users into four clusters: naive users, surpassing users, experts, and outshiners. The clustering is based on various metrics available on the forum. We use the X-means and expectation maximization clustering algorithms and compare the results. The results have been validated using internal, external, and relative validation techniques. The objective of this clustering is to trace and predict the activity of a user on the forum. According to our results, the majority of users (71% of the 40,000 users under consideration) fall in the ‘experts’ category. This indicates that Stack Overflow users are of high quality, making the forum an excellent platform for learning about computer programming.
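The abstract names expectation maximization (EM) as one of the two clustering algorithms but does not state which forum metrics or which tooling were used. As an illustration only, the following minimal sketch fits a four-component one-dimensional Gaussian mixture with EM on synthetic data. The function `em_cluster`, the variable `scores`, the cluster centres, and the initialisation heuristics are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_cluster(values, k, iterations=100):
    """Fit a k-component 1-D Gaussian mixture with EM.

    Returns (means, labels), where labels[i] is the index of the most
    probable component for values[i].
    """
    n = len(values)
    ordered = sorted(values)
    # Assumed heuristic: start each mean at an evenly spaced quantile of the data.
    means = [ordered[int((2 * j + 1) * n / (2 * k))] for j in range(k)]
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    # Assumed heuristic: each component starts with 1/k of the overall spread.
    variances = [max(overall_var / (k * k), 1e-6)] * k
    weights = [1.0 / k] * k

    resp = []
    for _ in range(iterations):
        # E-step: responsibility of every component for every point.
        resp = []
        for x in values:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a collapsed component

    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, labels


# Synthetic stand-in for a single per-user metric (hypothetical, not forum data):
# four well-separated groups of 50 users each.
rng = random.Random(42)
centres = [1.0, 4.0, 7.0, 10.0]
scores = [rng.gauss(c, 0.3) for c in centres for _ in range(50)]

means, labels = em_cluster(scores, k=4)
for j, m in enumerate(means):
    print(f"cluster {j}: mean={m:.2f}, size={labels.count(j)}")
```

Unlike this sketch, which fixes the number of clusters in advance, X-means estimates the number of clusters from the data, so a comparison of the two algorithms also compares how the cluster count is chosen.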

Metadata
Title
A Machine Learning Approach to Cluster the Users of Stack Overflow Forum
Written by
J. Anusha
V. Smrithi Rekha
P. Bagavathi Sivakumar
Copyright year
2015
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2135-7_44
