Published in: International Journal of Data Science and Analytics 2/2018

5 July 2018 | Regular Paper

A characterization of sample selection bias in system evaluation and the case of information retrieval

Written by: Massimo Melucci


Abstract

Sample selection bias consists of the effects of the procedure used to select individuals for inclusion in the sample. Bias may affect system evaluation in several respects, since it (i) requires larger samples than those sufficient to estimate the efficiency of one system, (ii) requires much larger samples to rank systems by efficiency and (iii) can penalize some systems. The unbiased measure described in this paper rewards the systems that perform poorly on difficult tasks, thus providing a better picture of both system efficiency and system ranking. Nevertheless, we found that bias cannot be completely removed when a group of systems is ranked, even though it is corrected for each single system. Finally, further research should be done to find methods that substantially improve retrieval effectiveness for difficult tasks.

Footnotes
1
For an introduction, the reader may refer to [34, 38].
 
References
1.
Ben-David, S., Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., Vaughan, J.W.: A theory of learning from different domains. Mach. Learn. 79, 151–175 (2010)
2.
Berk, R.A.: An introduction to sample selection bias in sociological data. Am. Sociol. Rev. 48(3), 386–398 (1983)
3.
Bickel, S., Brückner, M., Scheffer, T.: Discriminative learning under covariate shift. J. Mach. Learn. Res. 10, 2137–2155 (2009)
5.
Buckley, C., Voorhees, E.M.: Evaluating evaluation measure stability. In: Proceedings of SIGIR, pp. 33–40 (2000)
6.
Buckley, C., Voorhees, E.M.: Retrieval system evaluation. In: Voorhees, E.M., Harman, D. (eds.) TREC: Experiment and Evaluation in Information Retrieval, Chap. 3. MIT Press, Cambridge (2005)
7.
Cortes, C., Mohri, M.: Domain adaptation and sample bias correction theory and algorithm for regression. Theoret. Comput. Sci. 519, 103–126 (2014)
8.
Cortes, C., Mohri, M., Riley, M., Rostamizadeh, A.: Sample selection bias correction theory. In: Proceedings of ALT, pp. 38–53. Springer (2008)
9.
Cronen-Townsend, S., Zhou, Y., Croft, W.B.: Predicting query performance. In: Proceedings of SIGIR, pp. 299–306 (2002)
10.
Fan, W., Davidson, I.: ReverseTesting: an efficient framework to select amongst classifiers under sample selection bias. In: Proceedings of KDD, pp. 147–156 (2006)
11.
Gronau, R.: Wage comparisons—a selectivity bias. J. Polit. Econ. 82(6), 1119–1143 (1974)
12.
Guiver, J., Mizzaro, S., Robertson, S.: A few good topics: experiments in topic set reduction for retrieval evaluation. ACM Trans. Inf. Syst. 27, 1–26 (2009)
13.
Hagan, J., Parker, P.: White-collar crime and punishment: the class structure and legal sanctioning of securities violations. Am. Sociol. Rev. 50(3), 302–316 (1985)
14.
Hauff, C., Murdock, V., Baeza-Yates, R.: Improved query difficulty prediction for the web. In: Proceedings of CIKM, pp. 439–448 (2008)
15.
Hawking, D., Craswell, N.: Overview of TREC-10 Web track. In: Proceedings of TREC. Department of Commerce, National Institute of Standards and Technology (2002). http://trec.nist.gov/
17.
18.
Hosseini, M., Cox, I.J., Milic-Frayling, N., Shokouhi, M., Yilmaz, E.: An uncertainty-aware query selection model for evaluation of IR systems. In: Proceedings of SIGIR, pp. 901–910 (2012)
19.
Hosseini, M., Cox, I.J., Milic-Frayling, N., Sweeting, T., Vinay, V.: Prioritizing relevance judgments to improve the construction of IR test collections. In: Proceedings of CIKM, pp. 641–646 (2011)
20.
Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20(4), 422–446 (2002)
21.
Kanamori, T., Hido, S., Sugiyama, M.: A least-squares approach to direct importance estimation. J. Mach. Learn. Res. 10, 1391–1445 (2009)
22.
23.
Markov, A.A.: The Calculus of Probabilities. Gosizdat, Moscow (1913)
24.
Melucci, M.: Contextual Search: A Computational Framework. Foundations and Trends in Information Retrieval. Now Publishers, Breda (2012)
25.
Melucci, M.: Impact of query sample selection bias on information retrieval system ranking. In: Proceedings of IEEE DSAA (2016)
26.
Peterson, R.D., Hagan, J.: Changing conceptions of race: towards an account of anomalous findings of sentencing research. Am. Sociol. Rev. 49(1), 56–70 (1984)
27.
Read, C.R.: Markov’s inequality. In: Kotz, S., Read, C.B., Balakrishnan, N., Vidakovic, B., Johnson, N.L. (eds.) Encyclopaedia of Statistical Science. Wiley, Hoboken (2004)
28.
Ren, J., Shi, X., Fan, W., Yu, P.S.: Type-independent correction of sample selection bias via structural discovery and re-balancing. In: Proceedings of ICDM, pp. 565–576 (2008)
29.
Seah, C., Tsang, I.W., Ong, Y.: Healing sample selection bias by source classifier selection. In: Proceedings of ICDM, pp. 577–586 (2011)
31.
Sparck Jones, K., van Rijsbergen, C.: Information retrieval test collections. J. Doc. 32(1), 59–75 (1976)
32.
Spearman, C.: The proof and measurement of association between two things. Am. J. Psychol. 15(1), 72–101 (1904)
33.
Stevens, W.L.: Sampling without replacement with probability proportional to size. J. R. Stat. Soc. Ser. B (Methodol.) 20(2), 393–397 (1958)
34.
van Rijsbergen, C., Sparck Jones, K.: Report on the need for and provision of an “ideal” information retrieval test collection. Tech. Rep. BLRDR 5266, British Library. Cambridge University Computer Laboratory (1976)
35.
Vapnik, V.N.: Statistical Learning Theory. Wiley, Hoboken (1998)
36.
Vella, F.: Estimating models with sample selection bias: a survey. J. Hum. Resour. XXXII(1), 127–169 (2000)
37.
Voorhees, E., Buckley, C.: The effect of topic set size on retrieval experiment error. In: Proceedings of SIGIR, pp. 316–323 (2002)
38.
Voorhees, E., Harman, D. (eds.): TREC: Experiment and Evaluation in Information Retrieval. MIT Press, Cambridge (2005)
39.
Webber, W., Park, L.A.F.: Score adjustment for correction of pooling bias. In: Proceedings of SIGIR, pp. 444–451 (2009)
40.
Williams, B.: A Sampler on Sampling. Wiley, Hoboken (1978)
41.
Winship, C., Mare, R.D.: Models for sample selection bias. Annu. Rev. Sociol. 18, 327–350 (1992)
42.
Zadrozny, B.: Learning and evaluating classifiers under sample selection bias. In: Proceedings of the International Conference on Machine Learning (2004)
43.
Zobel, J.: How reliable are the results of large-scale information retrieval experiments? In: Proceedings of SIGIR, pp. 307–314 (1998)
Metadata
Title
A characterization of sample selection bias in system evaluation and the case of information retrieval
Written by
Massimo Melucci
Publication date
5 July 2018
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 2/2018
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-018-0134-x