Published in: Multimedia Systems 4/2023

12-04-2023 | Regular Paper

The segmented UEC Food-100 dataset with benchmark experiment on food detection

Authors: Elena Battini Sönmez, Sefer Memiş, Berker Arslan, Okan Zafer Batur

Abstract

Automatic food classification systems have several interesting applications, ranging from the detection of eating habits to food waste management and advertisement. When a food image contains multiple food items, a food detection step is necessary before classification. This work addresses the food detection problem and introduces to the research community the Segmented UEC Food-100 dataset, which expands the original UEC Food-100 database with segmentation masks. In the semantic segmentation experiment, the performance of YOLACT and DeepLabV3+ was compared, and YOLACT reached the best accuracy of 64.63% mIoU. In the instance segmentation experiment, YOLACT was used due to its speed and high accuracy. The benchmark performance on the newly released Segmented UEC Food-100 dataset is 68.83% mAP. For comparison purposes, experiments were also run on the UEC FoodPix Complete dataset of Okamoto et al. The database and the code will be available after publication.
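
The semantic segmentation result above is reported as mean Intersection over Union (mIoU), i.e. the per-class IoU between predicted and ground-truth label maps, averaged over the classes present. The snippet below is a minimal illustrative sketch of that metric, not the authors' evaluation code; the class count and ignore label in the toy example are assumptions.

    import numpy as np

    def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int,
                 ignore_label: int = 255) -> float:
        """Mean IoU between two integer label maps of identical shape."""
        valid = gt != ignore_label            # drop unannotated pixels
        pred, gt = pred[valid], gt[valid]
        ious = []
        for c in range(num_classes):
            pred_c, gt_c = pred == c, gt == c
            union = np.logical_or(pred_c, gt_c).sum()
            if union == 0:                    # class absent in both masks: skip it
                continue
            inter = np.logical_and(pred_c, gt_c).sum()
            ious.append(inter / union)
        return float(np.mean(ious)) if ious else 0.0

    # Toy 2x2 example with two hypothetical classes (0 = background, 1 = food)
    gt   = np.array([[0, 1], [1, 1]])
    pred = np.array([[0, 1], [0, 1]])
    print(f"mIoU: {mean_iou(pred, gt, num_classes=2):.4f}")   # 0.5833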


Literature
1. Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully convolutional networks. In: Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS'16), pp. 379–387. Curran Associates Inc., Red Hook (2016)
2. Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
3. Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8759–8768 (2018)
4. Pang, J., Chen, K., Shi, J., Feng, H., Ouyang, W., Lin, D.: Libra R-CNN: towards balanced learning for object detection. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 821–830. IEEE Computer Society, Los Alamitos (2019)
5. Wang, X., Zhang, S., Yu, Z., Feng, L., Zhang, W.: Scale-equalizing pyramid convolution for object detection. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13356–13365 (2020)
6. Dai, X., Chen, Y., Xiao, B., Chen, D., Liu, M., Yuan, L., Zhang, L.: Dynamic head: unifying object detection heads with attentions. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021)
7. Singh, B., Najibi, M., Sharma, A., Davis, L.S.: Scale normalized image pyramids with autofocus for object detection. IEEE Trans. Pattern Anal. Mach. Intell. 44, 3749–3766 (2022)
8. Matsuda, Y., Hoashi, H., Yanai, K.: Recognition of multiple-food images by detecting candidate regions. In: 2012 IEEE International Conference on Multimedia and Expo, pp. 25–30 (2012)
9. Okamoto, K., Yanai, K.: UEC-FoodPix Complete: a large-scale food image segmentation dataset. In: Proceedings of the ICPR Workshop on Multimedia Assisted Dietary Management (MADiMa) (2021)
10. Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)
11. Deng, Y., Manjunath, B.S.: Unsupervised segmentation of color-texture regions in images and video. IEEE Trans. Pattern Anal. Mach. Intell. 23(8), 800–810 (2001)
12. Ege, T., Yanai, K.: A new large-scale food image segmentation dataset and its application to food calorie estimation based on grains of rice. In: Proceedings of the ACMMM Workshop on Multimedia Assisted Dietary Management (MADiMa) (2019)
13. Rother, C., Kolmogorov, V., Blake, A.: "GrabCut": interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23(3), 309–314 (2004)
14. Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation (2018)
16. Chen, M.-Y., Yang, Y.-H., Ho, C.-J., Wang, S.-H., Liu, S.-M., Chang, E., Yeh, C.-H., Ouhyoung, M.: Automatic Chinese food identification and quantity estimation. In: SIGGRAPH Asia 2012 Technical Briefs (SA '12). Association for Computing Machinery, New York (2012). https://doi.org/10.1145/2407746.2407775
17. Wu, X., Fu, X., Liu, Y., Lim, E., Hoi, S.C.H., Sun, Q.: A large-scale benchmark for food image segmentation (2021). CoRR arXiv:2105.05409
18. Salvador, A., Hynes, N., Aytar, Y., Marin, J., Ofli, F., Weber, I., Torralba, A.: Learning cross-modal embeddings for cooking recipes and food images. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3068–3076 (2017). https://doi.org/10.1109/CVPR.2017.327
19. Everingham, M., Gool, L.V., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)
21. Lin, T.-Y., Maire, M., Belongie, S., Bourdev, L., Girshick, R., Hays, J., Perona, P., Ramanan, D., Zitnick, C.L., Dollár, P.: Microsoft COCO: common objects in context. In: Proceedings of the European Conference on Computer Vision (2014)
22. Everingham, M., Eslami, S.M., Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The PASCAL visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111(1), 98–136 (2015)
23. Russakovsky, O., Deng, J., Huang, Z., Berg, A.C., Fei-Fei, L.: Detecting avocados to zucchinis: what have we done, and where are we going? In: International Conference on Computer Vision (ICCV) (2013)
24. Deng, J., Socher, R., Fei-Fei, L., Dong, W., Li, K., Li, L.-J.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248–255 (2009)
25. Gao, J., Tan, W., Ma, L., Wang, Y., Tang, W.: MUSEFood: multi-sensor-based food volume estimation on smartphones. In: 2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), pp. 899–906 (2019)
26. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems. Curran Associates, Inc. (2015)
27. Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2017)
28. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015 (2015)
29. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988 (2017)
32. Chollet, F.: Xception: deep learning with depthwise separable convolutions (2017)
33. Jiang, S., Min, W., Liu, L., Luo, Z.: Multi-scale multi-view deep feature aggregation for food recognition. IEEE Trans. Image Process. 29, 265–276 (2020)
Metadata
Title
The segmented UEC Food-100 dataset with benchmark experiment on food detection
Authors
Elena Battini Sönmez
Sefer Memiş
Berker Arslan
Okan Zafer Batur
Publication date
12-04-2023
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 4/2023
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-023-01088-9
