
2021 | OriginalPaper | Book chapter

An E-Commerce Dataset in French for Multi-modal Product Categorization and Cross-Modal Retrieval

Written by: Hesam Amoualian, Parantapa Goswami, Pradipto Das, Pablo Montalvo, Laurent Ach, Nathaniel R. Dean

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing


Abstract

A multi-modal dataset of ninety-nine thousand product listings is made available from the production catalog of Rakuten France, a major e-commerce platform. Each product in the catalog data contains a textual title, a (possibly empty) textual description and an associated image. The dataset has been released as part of a data challenge hosted by the SIGIR ECom’20 Workshop. Two tasks are proposed, namely a principal large-scale multi-modal classification task and a subsidiary cross-modal retrieval task. This real-world dataset contains around 85K products with their corresponding product type categories, released as training data, and around 9.5K and 4.5K products released as held-out test sets for the multi-modal classification and cross-modal retrieval tasks respectively. The evaluation is run in two phases to measure system performance, first on 10% of the test data, and then on the remaining 90% of the test data. The different systems are evaluated using macro-F1 score for the multi-modal classification task and recall@1 for the cross-modal retrieval task. Additionally, a robust baseline system for the multi-modal classification task is proposed. The top performance obtained at the end of the second phase is \(91.44\%\) macro-F1 and \(34.28\%\) recall@1 for the two tasks respectively.
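
The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the challenge's official scoring script: the function names `macro_f1` and `recall_at_1` are hypothetical, and the sketch assumes that macro-F1 is the unweighted mean of per-class F1 over every class seen in the labels or predictions, and that each retrieval query has exactly one correct candidate, identified by its column index in a similarity matrix.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumption: classes = union of labels seen)."""
    tp = defaultdict(int)  # true positives per class
    fp = defaultdict(int)  # false positives per class
    fn = defaultdict(int)  # false negatives per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p wrongly
            fn[t] += 1  # missed the true class t
    classes = set(y_true) | set(y_pred)
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def recall_at_1(similarity, gold):
    """Fraction of queries whose top-scoring candidate is the correct one.

    similarity[i][j] is the score of candidate j for query i;
    gold[i] is the index of the single correct candidate for query i.
    """
    hits = 0
    for row, g in zip(similarity, gold):
        best = max(range(len(row)), key=row.__getitem__)  # argmax over candidates
        hits += int(best == g)
    return hits / len(gold)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(macro_f1(["a", "a", "b", "c"], ["a", "b", "b", "c"]))
    print(recall_at_1([[0.9, 0.1, 0.0], [0.2, 0.3, 0.5], [0.1, 0.8, 0.1]], [0, 1, 1]))
```

Because macro-F1 averages over classes rather than over products, a rare product type counts as much as a frequent one, which matters for a catalog whose categories are unevenly populated.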


Footnotes
2
Gross Merchandise Volume (GMV) is the total monetary value for merchandise sold through a particular marketplace over a certain period of time.
 
Metadata
Title
An E-Commerce Dataset in French for Multi-modal Product Categorization and Cross-Modal Retrieval
Written by
Hesam Amoualian
Parantapa Goswami
Pradipto Das
Pablo Montalvo
Laurent Ach
Nathaniel R. Dean
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-72113-8_2