Published in: International Journal of Machine Learning and Cybernetics 12/2020

24-06-2020 | Original Article

Benchmarking algorithms for food localization and semantic segmentation

Authors: Sinem Aslan, Gianluigi Ciocca, Davide Mazzini, Raimondo Schettini



Abstract

Food segmentation is challenging because food exhibits high intrinsic intra-class variability. In addition, food images taken in the wild may contain acquisition artifacts that can be problematic for segmentation algorithms. A proper evaluation of segmentation algorithms is of paramount importance for the design and improvement of food analysis systems that can work in less-than-ideal real-world scenarios. In this paper, we evaluate the performance of different deep learning-based segmentation algorithms in the context of food. Due to the lack of large-scale food segmentation datasets, we first create a new dataset composed of 5000 images of 50 diverse food categories, accurately annotated at the pixel level. To test the algorithms under different conditions, the dataset is augmented with the same images rendered under different acquisition distortions: illuminant change, JPEG compression, Gaussian noise, and Gaussian blur. The final dataset is composed of 120,000 images. Using standard benchmark measures, we conducted extensive experiments to evaluate ten state-of-the-art segmentation algorithms on two tasks: food localization and semantic food segmentation.
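To make the augmentation and evaluation idea concrete, the following is a minimal sketch, not the authors' pipeline. It shows two of the four distortion types named in the abstract (Gaussian noise and illuminant change) and one commonly used segmentation measure, mean intersection-over-union. The abstract does not say which measures or distortion parameters were used, so the function names, the diagonal per-channel gain model, the `sigma` and `gains` values, and the choice of mean IoU are all assumptions made for illustration. The only figure taken from the source is the ratio 120,000 / 5,000, which implies 24 renderings per source image.

```python
import random

# Images are nested lists of (R, G, B) tuples; masks are nested lists of class ids.
# All names and parameter values here are hypothetical, chosen for illustration.

def _clamp(value):
    """Round to an integer and clamp to the valid 8-bit range."""
    return min(255, max(0, round(value)))

def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma to every channel."""
    rng = random.Random(seed)
    return [[tuple(_clamp(c + rng.gauss(0.0, sigma)) for c in px) for px in row]
            for row in image]

def change_illuminant(image, gains):
    """Scale each RGB channel by its own gain (a simple diagonal illuminant model)."""
    return [[tuple(_clamp(c * g) for c, g in zip(px, gains)) for px in row]
            for row in image]

def mean_iou(pred, truth, num_classes):
    """Mean intersection-over-union over the classes present in either mask."""
    ious = []
    for k in range(num_classes):
        inter = union = 0
        for pred_row, truth_row in zip(pred, truth):
            for p, t in zip(pred_row, truth_row):
                inter += (p == k and t == k)
                union += (p == k or t == k)
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Renderings per source image implied by the abstract's dataset sizes.
versions_per_image = 120_000 // 5_000

if __name__ == "__main__":
    image = [[(200, 100, 50), (10, 20, 30)]]
    print(change_illuminant(image, (1.2, 1.0, 0.8)))
    print(add_gaussian_noise(image, sigma=10.0))
    print(mean_iou([[0, 1], [1, 1]], [[0, 1], [0, 1]], num_classes=2))
    print(versions_per_image)
```

In a robustness benchmark of this kind, one would typically run the same trained model on the clean and the distorted rendering of each image and compare the resulting scores against the shared ground-truth mask. JPEG compression and Gaussian blur are omitted here only because they are more naturally done with an imaging library.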

Metadata
Title
Benchmarking algorithms for food localization and semantic segmentation
Authors
Sinem Aslan
Gianluigi Ciocca
Davide Mazzini
Raimondo Schettini
Publication date
24-06-2020
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 12/2020
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-020-01153-z
