Published in: Journal of Intelligent Information Systems, Issue 1/2011

1 February 2011

Using web sources for improving video categorization

Authors: José M. Perea-Ortega, Arturo Montejo-Ráez, M. Teresa Martín-Valdivia, L. Alfonso Ureña-López


Abstract

This paper presents several experiments on video categorization using a supervised learning approach, with the VideoCLEF 2008 evaluation forum as the experimental framework. An analysis of the VideoCLEF corpus showed that video transcriptions are not the best source of information for identifying the topic of video streams. Therefore, two web-based corpora were generated to add further sources of information by integrating documents from Wikipedia articles and Google searches. A number of supervised categorization experiments were carried out using the VideoCLEF test data. Several machine learning algorithms were tested to validate the effect of the corpus on the final results: Naïve Bayes, k-nearest neighbors (KNN), Support Vector Machines (SVM), and the J48 decision tree. The results show that the web can be a useful source of information for generating classification models for video data.
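The general idea of training a classifier on web text and then applying it to a video transcription can be illustrated with a minimal sketch. It is not the authors' implementation (the footnotes indicate that RapidMiner and Weka were used); it is a small multinomial Naïve Bayes classifier with add-one smoothing written in plain Python. The category labels, the toy "web" documents, and the sample transcript are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep purely alphabetic tokens."""
    return [t for t in text.lower().split() if t.isalpha()]


def train_naive_bayes(labeled_docs):
    """Collect per-class document counts and term counts from (text, label) pairs."""
    class_doc_counts = Counter()
    term_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        tokens = tokenize(text)
        class_doc_counts[label] += 1
        term_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_doc_counts, term_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    class_doc_counts, term_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    vocab_size = len(vocabulary)
    best_label, best_score = None, float("-inf")
    for label in sorted(class_doc_counts):
        # Log prior: fraction of training documents with this label.
        score = math.log(class_doc_counts[label] / total_docs)
        total_terms = sum(term_counts[label].values())
        for token in tokenize(text):
            if token in vocabulary:  # terms unseen in training are ignored
                smoothed = (term_counts[label][token] + 1) / (total_terms + vocab_size)
                score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical stand-in for a web-derived training corpus (e.g. article text per category).
web_corpus = [
    ("orchestra concert violin symphony composer", "music"),
    ("guitar band album concert singer", "music"),
    ("excavation ruins ancient pottery dig", "archeology"),
    ("ancient tomb excavation artifacts site", "archeology"),
]

model = train_naive_bayes(web_corpus)

# Hypothetical transcription fragment of a video to be categorized.
transcript = "the composer conducted the orchestra at the concert"
predicted = classify(model, transcript)
print(predicted)
```

The design point the sketch tries to capture is that the training text and the classified text come from different sources: the model is built from web documents per category, and the transcription only appears at prediction time.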


Footnotes
6
SMART Project. Stop word list for English information retrieval, available at http://www.unine.ch/info/clef/englishST.txt.

7
The Snowball stemmer is available at http://snowball.tartarus.org/.

8
RapidMiner is available from http://rapid-i.com/.

9
Weka is a set of data mining algorithms and tools that is easily integrated into RapidMiner. More information is available at http://www.cs.waikato.ac.nz/ml/weka/.

Metadata
Title
Using web sources for improving video categorization
Authors
José M. Perea-Ortega
Arturo Montejo-Ráez
M. Teresa Martín-Valdivia
L. Alfonso Ureña-López
Publication date
1 February 2011
Publisher
Springer US
Published in
Journal of Intelligent Information Systems, Issue 1/2011
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-010-0123-6
