Published in: International Journal of Multimedia Information Retrieval 1/2013

1 March 2013 | Regular Paper

Minimal test collections for low-cost evaluation of Audio Music Similarity and Retrieval systems

Authors: Julián Urbano, Markus Schedl

Abstract

Reliable evaluation of Information Retrieval systems requires large amounts of relevance judgments. Making these annotations is not only tedious but also complex for many Music Information Retrieval tasks. As a result, performing such evaluations usually requires too much effort. A low-cost alternative is the application of Minimal Test Collections algorithms, which offer very reliable results while significantly reducing the required annotation effort. The idea is to represent effectiveness scores as random variables that can be estimated, iteratively selecting which documents to judge so that we can compute accurate estimates with a certain degree of confidence and with the least effort. In this paper we show the application of Minimal Test Collections to the evaluation of the Audio Music Similarity and Retrieval task, run by the annual MIREX evaluation campaign. An analysis with the MIREX 2007, 2009, 2010 and 2011 data shows that with as little as 2% of the total judgments we can obtain accurate estimates of the ranking of systems. We also present a method to rank systems without making any annotations, which can be successfully used when few or no resources are available.
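
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It treats the gain of every unjudged document as a random variable, estimates the difference in mean gain between two systems' top-k results, and picks the next document to judge. The gain scale `LEVELS`, the uniform prior over unjudged documents, the independence between documents, the normal approximation, and all function names are assumptions made for illustration only.

```python
import math
from statistics import NormalDist

# Hypothetical gain scale for one similarity judgment (assumption of this sketch).
LEVELS = (0, 1, 2)


def gain_moments(doc, judged):
    """Return (mean, variance) of a document's gain.

    A judged document has a known gain and zero variance; an unjudged one
    is assumed to be uniformly distributed over LEVELS.
    """
    if doc in judged:
        return float(judged[doc]), 0.0
    mean = sum(LEVELS) / len(LEVELS)
    var = sum((level - mean) ** 2 for level in LEVELS) / len(LEVELS)
    return mean, var


def coefficients(run_a, run_b, k):
    """Weight of each document in (mean gain of A's top k) - (mean gain of B's top k).

    Assumes neither ranked list contains duplicate documents.
    """
    coef = {}
    for doc in run_a[:k]:
        coef[doc] = coef.get(doc, 0.0) + 1.0 / k
    for doc in run_b[:k]:
        coef[doc] = coef.get(doc, 0.0) - 1.0 / k
    return coef


def estimate_difference(run_a, run_b, judged, k):
    """Expected value and variance of the difference, assuming independent gains."""
    mean, var = 0.0, 0.0
    for doc, c in coefficients(run_a, run_b, k).items():
        m, v = gain_moments(doc, judged)
        mean += c * m
        # Coefficients are squared, so no document can reduce the total variance.
        var += c * c * v
    return mean, var


def confidence_in_sign(mean, var):
    """Confidence that the difference has the sign of its estimate (normal approximation)."""
    if var == 0.0:
        return 1.0 if mean != 0.0 else 0.5
    p_negative = NormalDist(mean, math.sqrt(var)).cdf(0.0)
    return max(p_negative, 1.0 - p_negative)


def next_document(run_a, run_b, judged, k):
    """Unjudged document with the largest absolute weight, or None if none can matter."""
    pending = {
        doc: abs(c)
        for doc, c in coefficients(run_a, run_b, k).items()
        if doc not in judged and c != 0.0
    }
    return max(pending, key=pending.get) if pending else None
```

In use, one would call `next_document`, ask an assessor for that document's gain, store it in `judged`, and stop as soon as `confidence_in_sign` reaches a chosen threshold. Documents retrieved by both systems at the top k cancel out in the difference, so under these assumptions they never need to be judged to compare the two systems.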

Footnotes

1. In early editions of MIREX it was defined from 0 to 10, with one decimal digit. Both definitions are equivalent.
2. The indicator functions are squared in the variance so all documents have a positive contribution to the total variance.
3. Note that this is rarely true in Text Information Retrieval.
4. Note that \(P(G_i \ge l_1 \mid f_i)\) is always 1.

Metadata
Title: Minimal test collections for low-cost evaluation of Audio Music Similarity and Retrieval systems
Authors: Julián Urbano, Markus Schedl
Publication date: 1 March 2013
Publisher: Springer-Verlag
Published in: International Journal of Multimedia Information Retrieval, Issue 1/2013
Print ISSN: 2192-6611
Electronic ISSN: 2192-662X
DOI: https://doi.org/10.1007/s13735-012-0030-4
