Published in: Empirical Software Engineering 6/2017

09-04-2017

What do developers search for on the web?

Authors: Xin Xia, Lingfeng Bao, David Lo, Pavneet Singh Kochhar, Ahmed E. Hassan, Zhenchang Xing

Abstract

Developers commonly use web search engines such as Google to locate online resources that improve their productivity. A better understanding of what developers search for could help us understand their behaviors and the problems that they encounter during the software development process. Unfortunately, we have a limited understanding of what developers frequently search for and of the search tasks that they often find challenging. To address this gap, we collected search queries from 60 developers and surveyed 235 software engineers from more than 21 countries across five continents. In particular, we asked our survey participants to rate the frequency and difficulty of 34 search tasks, which are grouped along the following seven dimensions: general search, debugging and bug fixing, programming, third-party code reuse, tools, database, and testing. We find that searching for explanations of unknown terminology, explanations of exceptions/error messages (e.g., HTTP 404), reusable code snippets, solutions to common programming bugs, and suitable third-party libraries/services are the most frequent search tasks that developers perform, while searching for solutions to performance bugs, solutions to multi-threading bugs, public datasets to test newly developed algorithms or systems, reusable code snippets, best industrial practices, database optimization solutions, solutions to security bugs, and solutions to software configuration bugs are the search tasks that developers consider most difficult. Our study sheds light on why practitioners often perform some of these tasks and why they find some of them challenging. We also discuss the implications of our findings for future research in several areas, e.g., code search engines, domain-specific search engines, and automated generation and refinement of search queries.

Footnotes
4
Note that although Google is blocked in China, developers can still access it through a proxy service such as Shadowsocks. See https://shadowsocks.com/ for more details.
 
5
CSDN is one of the largest technical blog sites in China; see http://www.csdn.net/ for more details.
 
7
Zhihu is one of the largest Q&A sites in China; see http://www.zhihu.com/ for more details.
 
10
Cloudera is a software company that provides Apache Hadoop-based software, support, services, and training to business customers (https://www.cloudera.com/).
 
11
We identified another 6 search tasks in the open-ended interviews.
 
14
Wensong Zhang is the co-founder of the Linux Virtual Server project; see https://en.wikipedia.org/wiki/Linux_Virtual_Server for more details.
 
15
An expletive was masked out.
 
16
https://en.wikipedia.org/wiki/List_of_languages_by_total_number_of_speakers
 
Metadata
Title
What do developers search for on the web?
Authors
Xin Xia
Lingfeng Bao
David Lo
Pavneet Singh Kochhar
Ahmed E. Hassan
Zhenchang Xing
Publication date
09-04-2017
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 6/2017
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-017-9514-4
