2020 | OriginalPaper | Book chapter

1. Unsupervised Visual Learning: From Pixels to Seeing

Author: Marius Leordeanu

Published in: Unsupervised Learning in Space and Time

Publisher: Springer International Publishing

Abstract

This book is about unsupervised learning, one of the most challenging puzzles that we must solve, piece by piece, in order to decode the secrets of intelligence. Here, we move closer to that goal by connecting classical computational models to newer deep learning ones, and then building on a few fundamental and intuitive unsupervised learning principles. We want to reduce the unsupervised learning problem to a set of essential ideas and then develop the computational tools needed to implement them in the real world. Eventually, we aim to imagine a universal unsupervised learning machine, the Visual Story Network. The book is written for young students as well as experienced researchers, engineers, and professors. It presents computational models and optimization algorithms in sufficient technical detail, while also creating and maintaining a big intuitive picture of the main subject. Different tasks, such as graph matching and clustering, feature selection, classifier learning, unsupervised object discovery and segmentation in video, teacher-student learning over multiple generations, and recursive graph neural networks, are brought together, chapter by chapter, under the same umbrella of unsupervised learning. In the current chapter, we introduce the reader to the overall story of the book, which gives a unified picture of the different topics presented in detail in the chapters that follow. Besides sharing the main goal of learning without human supervision, the problems and tasks presented in the book also share common computational graph models and optimization methods, such as spectral graph matching, spectral clustering, and the integer projected fixed point method. By bringing together similar mathematical formulations across different tasks, all guided by common intuitive principles towards a universal unsupervised learning system, the book invites the reader to absorb and then participate in the creation of the next generation of artificial intelligence.
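
As a rough illustration of the kind of computation that spectral graph matching and spectral clustering have in common, the sketch below finds the principal eigenvector of a small pairwise affinity matrix by power iteration and reads off the strongly connected group of nodes. It is a minimal sketch, not the book's implementation: the function name `leading_eigenvector`, the toy matrix `M`, and the `1/sqrt(n)` threshold used to select the cluster are all assumptions made for this example.

```python
import math

def leading_eigenvector(M, iters=200, tol=1e-10):
    """Power iteration for the principal eigenvector of a symmetric,
    non-negative matrix M given as a list of lists (hypothetical helper)."""
    n = len(M)
    x = [1.0 / math.sqrt(n)] * n  # uniform, unit-norm starting vector
    for _ in range(iters):
        # Multiply by M, then renormalize to unit Euclidean length.
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            return x
        y = [v / norm for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x

# Toy affinity matrix (made-up data): nodes 0-2 agree strongly with each
# other, while nodes 3-4 are only weakly linked to the rest.
M = [
    [0.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 0.0, 0.9, 0.0, 0.1],
    [0.8, 0.9, 0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0, 0.1, 0.0],
]

v = leading_eigenvector(M)

# Assumed selection rule: keep nodes whose score exceeds the uniform value.
threshold = 1.0 / math.sqrt(len(v))
cluster = [i for i, score in enumerate(v) if score > threshold]
print(cluster)
```

In this toy case the eigenvector concentrates its mass on the three mutually consistent nodes. In a matching setting the nodes would instead stand for candidate assignments, and a discretization step would replace the simple threshold.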

Literatur
1.
Zurück zum Zitat Gan G, Ma C, Wu J (2007) Data clustering: theory, algorithms, and applications, vol 20. Siam Gan G, Ma C, Wu J (2007) Data clustering: theory, algorithms, and applications, vol 20. Siam
3.
Zurück zum Zitat Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the em algorithm. J Roy Stat Soc: Ser B (Methodol) 39(1):1–22MathSciNetMATH Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the em algorithm. J Roy Stat Soc: Ser B (Methodol) 39(1):1–22MathSciNetMATH
4.
Zurück zum Zitat Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. Pattern Anal Mac Intell 24(5): Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. Pattern Anal Mac Intell 24(5):
5.
Zurück zum Zitat Fukunaga K, Hostetler L (1975) The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans Inf Theory 21(1):32–40MathSciNetCrossRef Fukunaga K, Hostetler L (1975) The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans Inf Theory 21(1):32–40MathSciNetCrossRef
6.
Zurück zum Zitat Ester M, Kriegel HP, Sander J, Xu X et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Kdd, vol 96, pp 226–231 Ester M, Kriegel HP, Sander J, Xu X et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Kdd, vol 96, pp 226–231
7.
Zurück zum Zitat Day WH, Edelsbrunner H (1984) Efficient algorithms for agglomerative hierarchical clustering methods. J Classif 1(1):7–24CrossRef Day WH, Edelsbrunner H (1984) Efficient algorithms for agglomerative hierarchical clustering methods. J Classif 1(1):7–24CrossRef
8.
9.
Zurück zum Zitat Sibson R (1973) Slink: an optimally efficient algorithm for the single-link cluster method. Comput J 16(1):30–34MathSciNetCrossRef Sibson R (1973) Slink: an optimally efficient algorithm for the single-link cluster method. Comput J 16(1):30–34MathSciNetCrossRef
10.
Zurück zum Zitat Johnson SC (1967) Hierarchical clustering schemes. Psychometrika 32(3):241–254CrossRef Johnson SC (1967) Hierarchical clustering schemes. Psychometrika 32(3):241–254CrossRef
11.
Zurück zum Zitat Kaufman L, Rousseeuw PJ (2009) Finding groups in data: an introduction to cluster analysis, vol 344. Wiley Kaufman L, Rousseeuw PJ (2009) Finding groups in data: an introduction to cluster analysis, vol 344. Wiley
12.
Zurück zum Zitat Cheeger J (1969) A lower bound for the smallest eigenvalue of the laplacian. In: Proceedings of the Princeton conference in honor of Professor S. Bochner, pp 195–199 Cheeger J (1969) A lower bound for the smallest eigenvalue of the laplacian. In: Proceedings of the Princeton conference in honor of Professor S. Bochner, pp 195–199
13.
Zurück zum Zitat Donath WE, Hoffman AJ (1972) Algorithms for partitioning of graphs and computer logic based on eigenvectors of connection matrices. IBM Tech Discl Bull 15(3):938–944 Donath WE, Hoffman AJ (1972) Algorithms for partitioning of graphs and computer logic based on eigenvectors of connection matrices. IBM Tech Discl Bull 15(3):938–944
14.
Zurück zum Zitat Meila M, Shi J (2001) Learning segmentation by random walks. In: Advances in neural information processing systems, pp 873–879 Meila M, Shi J (2001) Learning segmentation by random walks. In: Advances in neural information processing systems, pp 873–879
15.
Zurück zum Zitat Shi J, Malik J (2000) Normalized cuts and image segmentation. PAMI 22(8) Shi J, Malik J (2000) Normalized cuts and image segmentation. PAMI 22(8)
16.
Zurück zum Zitat Ng A, Jordan M, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: NIPS Ng A, Jordan M, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: NIPS
17.
Zurück zum Zitat Duda RO, Hart PE, Stork DG (2012) Pattern classification. Wiley Duda RO, Hart PE, Stork DG (2012) Pattern classification. Wiley
18.
Zurück zum Zitat Sivic J, Zisserman A (2003) Video google: a text retrieval approach to object matching in videos. In: CVPR Sivic J, Zisserman A (2003) Video google: a text retrieval approach to object matching in videos. In: CVPR
19.
Zurück zum Zitat Leordeanu M, Collins R, Hebert M (2005) Unsupervised learning of object features from video sequences. In: IEEE computer society conference on computer vision and pattern recognition, IEEE computer society; 1999, vol 1, p 1142 Leordeanu M, Collins R, Hebert M (2005) Unsupervised learning of object features from video sequences. In: IEEE computer society conference on computer vision and pattern recognition, IEEE computer society; 1999, vol 1, p 1142
20.
Zurück zum Zitat Kwak S, Cho M, Laptev I, Ponce J, Schmid C (2015) Unsupervised object discovery and tracking in video collections. In: Proceedings of the IEEE international conference on computer vision, pp 3173–3181 Kwak S, Cho M, Laptev I, Ponce J, Schmid C (2015) Unsupervised object discovery and tracking in video collections. In: Proceedings of the IEEE international conference on computer vision, pp 3173–3181
21.
Zurück zum Zitat Liu D, Chen T (2007) A topic-motion model for unsupervised video object discovery. In: CVPR Liu D, Chen T (2007) A topic-motion model for unsupervised video object discovery. In: CVPR
22.
Zurück zum Zitat Wang L, Hua G, Sukthankar R, Xue J, Niu Z, Zheng N (2016) Video object discovery and co-segmentation with extremely weak supervision. IEEE transactions on pattern analysis and machine intelligence Wang L, Hua G, Sukthankar R, Xue J, Niu Z, Zheng N (2016) Video object discovery and co-segmentation with extremely weak supervision. IEEE transactions on pattern analysis and machine intelligence
23.
Zurück zum Zitat Perazzi F, Pont-Tuset J, McWilliams B, Van Gool L, Gross M, Sorkine-Hornung A (2016) A benchmark dataset and evaluation methodology for video object segmentation. In: Computer vision and pattern recognition Perazzi F, Pont-Tuset J, McWilliams B, Van Gool L, Gross M, Sorkine-Hornung A (2016) A benchmark dataset and evaluation methodology for video object segmentation. In: Computer vision and pattern recognition
24.
Zurück zum Zitat Lao D, Sundaramoorthi G (2018) Extending layered models to 3d motion. In: Proceedings of the European conference on computer vision (ECCV), pp 435–451 Lao D, Sundaramoorthi G (2018) Extending layered models to 3d motion. In: Proceedings of the European conference on computer vision (ECCV), pp 435–451
25.
Zurück zum Zitat Papazoglou A, Ferrari V (2013) Fast object segmentation in unconstrained video. In: Proceedings of the IEEE international conference on computer vision, pp 1777–1784 Papazoglou A, Ferrari V (2013) Fast object segmentation in unconstrained video. In: Proceedings of the IEEE international conference on computer vision, pp 1777–1784
26.
Zurück zum Zitat Keuper M, Andres B, Brox T (2015) Motion trajectory segmentation via minimum cost multicuts. In: Proceedings of the IEEE international conference on computer vision, pp 3271–3279 Keuper M, Andres B, Brox T (2015) Motion trajectory segmentation via minimum cost multicuts. In: Proceedings of the IEEE international conference on computer vision, pp 3271–3279
27.
Zurück zum Zitat Faktor A, Irani M (2014) Video segmentation by non-local consensus voting. In: BMVC, vol 2, p 8 Faktor A, Irani M (2014) Video segmentation by non-local consensus voting. In: BMVC, vol 2, p 8
28.
Zurück zum Zitat Haller E, Leordeanu M (2017) Unsupervised object segmentation in video by efficient selection of highly probable positive features. In: Proceedings of the IEEE international conference on computer vision, pp 5085–5093 Haller E, Leordeanu M (2017) Unsupervised object segmentation in video by efficient selection of highly probable positive features. In: Proceedings of the IEEE international conference on computer vision, pp 5085–5093
29.
Zurück zum Zitat Luiten J, Voigtlaender P, Leibe B (2018) Premvos: proposal-generation, refinement and merging for the davis challenge on video object segmentation 2018. In: The 2018 DAVIS challenge on video object segmentation-CVPR workshops Luiten J, Voigtlaender P, Leibe B (2018) Premvos: proposal-generation, refinement and merging for the davis challenge on video object segmentation 2018. In: The 2018 DAVIS challenge on video object segmentation-CVPR workshops
30.
Zurück zum Zitat Maninis KK, Caelles S, Chen Y, Pont-Tuset J, Leal-Taixé L, Cremers D, Van Gool L (2017) Video object segmentation without temporal information. arXiv preprint arXiv:170906031 Maninis KK, Caelles S, Chen Y, Pont-Tuset J, Leal-Taixé L, Cremers D, Van Gool L (2017) Video object segmentation without temporal information. arXiv preprint arXiv:​170906031
31.
Zurück zum Zitat Voigtlaender P, Leibe B (2017) Online adaptation of convolutional neural networks for the 2017 davis challenge on video object segmentation. In: The 2017 DAVIS challenge on video object segmentation-CVPR workshops, vol 5 Voigtlaender P, Leibe B (2017) Online adaptation of convolutional neural networks for the 2017 davis challenge on video object segmentation. In: The 2017 DAVIS challenge on video object segmentation-CVPR workshops, vol 5
32.
Zurück zum Zitat Bao L, Wu B, Liu W (2018) Cnn in mrf: video object segmentation via inference in a cnn-based higher-order spatio-temporal mrf. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5977–5986 Bao L, Wu B, Liu W (2018) Cnn in mrf: video object segmentation via inference in a cnn-based higher-order spatio-temporal mrf. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5977–5986
33.
Zurück zum Zitat Wug Oh S, Lee JY, Sunkavalli K, Joo Kim S (2018) Fast video object segmentation by reference-guided mask propagation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7376–7385 Wug Oh S, Lee JY, Sunkavalli K, Joo Kim S (2018) Fast video object segmentation by reference-guided mask propagation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7376–7385
34.
Zurück zum Zitat Cheng J, Tsai YH, Hung WC, Wang S, Yang MH (2018) Fast and accurate online video object segmentation via tracking parts. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7415–7424 Cheng J, Tsai YH, Hung WC, Wang S, Yang MH (2018) Fast and accurate online video object segmentation via tracking parts. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7415–7424
35.
Zurück zum Zitat Caelles S, Maninis KK, Pont-Tuset J, Leal-Taixé L, Cremers D, Van Gool L (2017) One-shot video object segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 221–230 Caelles S, Maninis KK, Pont-Tuset J, Leal-Taixé L, Cremers D, Van Gool L (2017) One-shot video object segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 221–230
36.
Zurück zum Zitat Perazzi F, Khoreva A, Benenson R, Schiele B, Sorkine-Hornung A (2017) Learning video object segmentation from static images. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2663–2672 Perazzi F, Khoreva A, Benenson R, Schiele B, Sorkine-Hornung A (2017) Learning video object segmentation from static images. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2663–2672
37.
Zurück zum Zitat Chen Y, Pont-Tuset J, Montes A, Van Gool L (2018) Blazingly fast video object segmentation with pixel-wise metric learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1189–1198 Chen Y, Pont-Tuset J, Montes A, Van Gool L (2018) Blazingly fast video object segmentation with pixel-wise metric learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1189–1198
38.
Zurück zum Zitat Song H, Wang W, Zhao S, Shen J, Lam KM (2018) Pyramid dilated deeper convlstm for video salient object detection. In: Proceedings of the European conference on computer vision (ECCV), pp 715–731 Song H, Wang W, Zhao S, Shen J, Lam KM (2018) Pyramid dilated deeper convlstm for video salient object detection. In: Proceedings of the European conference on computer vision (ECCV), pp 715–731
39.
Zurück zum Zitat Tokmakov P, Alahari K, Schmid C (2017) Learning video object segmentation with visual memory. arXiv preprint arXiv:170405737 Tokmakov P, Alahari K, Schmid C (2017) Learning video object segmentation with visual memory. arXiv preprint arXiv:​170405737
40.
Zurück zum Zitat Jain SD, Xiong B, Grauman K (2017) Fusionseg: learning to combine motion and appearance for fully automatic segmention of generic objects in videos. arXiv preprint arXiv:170105384 2(3):6 Jain SD, Xiong B, Grauman K (2017) Fusionseg: learning to combine motion and appearance for fully automatic segmention of generic objects in videos. arXiv preprint arXiv:​170105384 2(3):6
41.
Zurück zum Zitat Yang Z, Wang Q, Bertinetto L, Hu W, Bai S, Torr PH (2019) Anchor diffusion for unsupervised video object segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 931–940 Yang Z, Wang Q, Bertinetto L, Hu W, Bai S, Torr PH (2019) Anchor diffusion for unsupervised video object segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 931–940
42.
Zurück zum Zitat Wang W, Song H, Zhao S, Shen J, Zhao S, Hoi SC, Ling H (2019) Learning unsupervised video object segmentation through visual attention. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3064–3074 Wang W, Song H, Zhao S, Shen J, Zhao S, Hoi SC, Ling H (2019) Learning unsupervised video object segmentation through visual attention. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3064–3074
43.
Zurück zum Zitat Kulkarni TD, Gupta A, Ionescu C, Borgeaud S, Reynolds M, Zisserman A, Mnih V (2019) Unsupervised learning of object keypoints for perception and control. In: Advances in neural information processing systems, pp 10,723–10,733 Kulkarni TD, Gupta A, Ionescu C, Borgeaud S, Reynolds M, Zisserman A, Mnih V (2019) Unsupervised learning of object keypoints for perception and control. In: Advances in neural information processing systems, pp 10,723–10,733
44.
Zurück zum Zitat Minderer M, Sun C, Villegas R, Cole F, Murphy K, Lee H (2019) Unsupervised learning of object structure and dynamics from videos. NeurlPS Minderer M, Sun C, Villegas R, Cole F, Murphy K, Lee H (2019) Unsupervised learning of object structure and dynamics from videos. NeurlPS
45.
Zurück zum Zitat Thewlis J, Bilen H, Vedaldi A (2017) Unsupervised learning of object landmarks by factorized spatial embeddings. In: Proceedings of the IEEE international conference on computer vision, pp 5916–5925 Thewlis J, Bilen H, Vedaldi A (2017) Unsupervised learning of object landmarks by factorized spatial embeddings. In: Proceedings of the IEEE international conference on computer vision, pp 5916–5925
46.
Zurück zum Zitat Roufosse JM, Sharma A, Ovsjanikov M (2019) Unsupervised deep learning for structured shape matching. In: Proceedings of the IEEE international conference on computer vision, pp 1617–1627 Roufosse JM, Sharma A, Ovsjanikov M (2019) Unsupervised deep learning for structured shape matching. In: Proceedings of the IEEE international conference on computer vision, pp 1617–1627
47.
Zurück zum Zitat Leordeanu M, Sukthankar R, Hebert M (2009) Unsupervised learning for graph matching. IJCV 96(1) Leordeanu M, Sukthankar R, Hebert M (2009) Unsupervised learning for graph matching. IJCV 96(1)
48.
Zurück zum Zitat Halimi O, Litany O, Rodola E, Bronstein AM, Kimmel R (2019) Unsupervised learning of dense shape correspondence. In: The IEEE conference on computer vision and pattern recognition (CVPR) Halimi O, Litany O, Rodola E, Bronstein AM, Kimmel R (2019) Unsupervised learning of dense shape correspondence. In: The IEEE conference on computer vision and pattern recognition (CVPR)
49.
Zurück zum Zitat Vo HV, Bach F, Cho M, Han K, LeCun Y, Perez P, Ponce J (2019) Unsupervised image matching and object discovery as optimization. In: The IEEE conference on computer vision and pattern recognition (CVPR) Vo HV, Bach F, Cho M, Han K, LeCun Y, Perez P, Ponce J (2019) Unsupervised image matching and object discovery as optimization. In: The IEEE conference on computer vision and pattern recognition (CVPR)
50.
Zurück zum Zitat Pei Y, Huang F, Shi F, Zha H (2011) Unsupervised image matching based on manifold alignment. IEEE Trans Pattern Anal Mach Intell 34(8):1658–1664 Pei Y, Huang F, Shi F, Zha H (2011) Unsupervised image matching based on manifold alignment. IEEE Trans Pattern Anal Mach Intell 34(8):1658–1664
51.
Zurück zum Zitat Leordeanu M, Zanfir A, Sminchisescu C (2011) Semi-supervised learning and optimization for hypergraph matching. In: ICCV Leordeanu M, Zanfir A, Sminchisescu C (2011) Semi-supervised learning and optimization for hypergraph matching. In: ICCV
52.
Zurück zum Zitat Rezende DJ, Eslami SA, Mohamed S, Battaglia P, Jaderberg M, Heess N (2016) Unsupervised learning of 3d structure from images. In: Advances in neural information processing systems, pp 4996–5004 Rezende DJ, Eslami SA, Mohamed S, Battaglia P, Jaderberg M, Heess N (2016) Unsupervised learning of 3d structure from images. In: Advances in neural information processing systems, pp 4996–5004
53.
Zurück zum Zitat Cha G, Lee M, Oh S (2019) Unsupervised 3d reconstruction networks. In: International conference on computer vision Cha G, Lee M, Oh S (2019) Unsupervised 3d reconstruction networks. In: International conference on computer vision
54.
Zurück zum Zitat Nunes UM, Demiris Y (2019) Online unsupervised learning of the 3d kinematic structure of arbitrary rigid bodies. In: Proceedings of the IEEE international conference on computer vision, pp 3809–3817 Nunes UM, Demiris Y (2019) Online unsupervised learning of the 3d kinematic structure of arbitrary rigid bodies. In: Proceedings of the IEEE international conference on computer vision, pp 3809–3817
55.
Zurück zum Zitat Chen Y, Schmid C, Sminchisescu C (2019) Self-supervised learning with geometric constraints in monocular video: connecting flow, depth, and camera. In: Proceedings of the IEEE international conference on computer vision, pp 7063–7072 Chen Y, Schmid C, Sminchisescu C (2019) Self-supervised learning with geometric constraints in monocular video: connecting flow, depth, and camera. In: Proceedings of the IEEE international conference on computer vision, pp 7063–7072
56.
Zurück zum Zitat Godard C, Mac Aodha O, Firman M, Brostow GJ (2019) Digging into self-supervised monocular depth estimation. In: Proceedings of the IEEE international conference on computer vision, pp 3828–3838 Godard C, Mac Aodha O, Firman M, Brostow GJ (2019) Digging into self-supervised monocular depth estimation. In: Proceedings of the IEEE international conference on computer vision, pp 3828–3838
57.
Zurück zum Zitat Zhou T, Brown M, Snavely N, Lowe DG (2017) Unsupervised learning of depth and ego-motion from video. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1851–1858 Zhou T, Brown M, Snavely N, Lowe DG (2017) Unsupervised learning of depth and ego-motion from video. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1851–1858
58.
Zurück zum Zitat Ranjan A, Jampani V, Balles L, Kim K, Sun D, Wulff J, Black MJ (2019) Competitive collaboration: joint unsupervised learning of depth, camera motion, optical flow and motion segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 12,240–12,249 Ranjan A, Jampani V, Balles L, Kim K, Sun D, Wulff J, Black MJ (2019) Competitive collaboration: joint unsupervised learning of depth, camera motion, optical flow and motion segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 12,240–12,249
59.
Zurück zum Zitat Bian J, Li Z, Wang N, Zhan H, Shen C, Cheng MM, Reid I (2019) Unsupervised scale-consistent depth and ego-motion learning from monocular video. In: Advances in neural information processing systems, pp 35–45 Bian J, Li Z, Wang N, Zhan H, Shen C, Cheng MM, Reid I (2019) Unsupervised scale-consistent depth and ego-motion learning from monocular video. In: Advances in neural information processing systems, pp 35–45
60.
Zurück zum Zitat Gordon A, Li H, Jonschkowski R, Angelova A (2019) Depth from videos in the wild: unsupervised monocular depth learning from unknown cameras. arXiv preprint arXiv:190404998 Gordon A, Li H, Jonschkowski R, Angelova A (2019) Depth from videos in the wild: unsupervised monocular depth learning from unknown cameras. arXiv preprint arXiv:​190404998
61.
Zurück zum Zitat Yang Z, Wang P, Wang Y, Xu W, Nevatia R (2018) Lego: learning edge with geometry all at once by watching videos. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 225–234 Yang Z, Wang P, Wang Y, Xu W, Nevatia R (2018) Lego: learning edge with geometry all at once by watching videos. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 225–234
62.
Zurück zum Zitat Yang Z, Wang P, Xu W, Zhao L, Nevatia R (2018) Unsupervised learning of geometry from videos with edge-aware depth-normal consistency. In: Thirty-Second AAAI conference on artificial intelligence Yang Z, Wang P, Xu W, Zhao L, Nevatia R (2018) Unsupervised learning of geometry from videos with edge-aware depth-normal consistency. In: Thirty-Second AAAI conference on artificial intelligence
63.
Zurück zum Zitat de Sa VR (1994) Unsupervised classification learning from cross-modal environmental structure. PhD thesis, University of Rochester de Sa VR (1994) Unsupervised classification learning from cross-modal environmental structure. PhD thesis, University of Rochester
64.
Zurück zum Zitat Hu D, Nie F, Li X (2019) Deep multimodal clustering for unsupervised audiovisual learning. In: The IEEE conference on computer vision and pattern recognition (CVPR) Hu D, Nie F, Li X (2019) Deep multimodal clustering for unsupervised audiovisual learning. In: The IEEE conference on computer vision and pattern recognition (CVPR)
65.
Zurück zum Zitat Li Y, Zhu JY, Tedrake R, Torralba A (2019) Connecting touch and vision via cross-modal prediction. In: The IEEE conference on computer vision and pattern recognition (CVPR) Li Y, Zhu JY, Tedrake R, Torralba A (2019) Connecting touch and vision via cross-modal prediction. In: The IEEE conference on computer vision and pattern recognition (CVPR)
66.
Zurück zum Zitat Zhang R, Isola P, Efros AA (2017) Split-brain autoencoders: unsupervised learning by cross-channel prediction. In: CVPR, vol 1, p 5 Zhang R, Isola P, Efros AA (2017) Split-brain autoencoders: unsupervised learning by cross-channel prediction. In: CVPR, vol 1, p 5
67.
Zurück zum Zitat Pan JY, Yang HJ, Faloutsos C, Duygulu P (2004) Automatic multimedia cross-modal correlation discovery. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp 653–658 Pan JY, Yang HJ, Faloutsos C, Duygulu P (2004) Automatic multimedia cross-modal correlation discovery. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp 653–658
68.
Zurück zum Zitat He L, Xu X, Lu H, Yang Y, Shen F, Shen HT (2017) Unsupervised cross-modal retrieval through adversarial learning. In: 2017 IEEE International conference on multimedia and expo (ICME), IEEE, pp 1153–1158 He L, Xu X, Lu H, Yang Y, Shen F, Shen HT (2017) Unsupervised cross-modal retrieval through adversarial learning. In: 2017 IEEE International conference on multimedia and expo (ICME), IEEE, pp 1153–1158
69.
Zurück zum Zitat Zhao H, Gan C, Rouditchenko A, Vondrick C, McDermott J, Torralba A (2018) The sound of pixels. In: Proceedings of the European conference on computer vision (ECCV), pp 570–586 Zhao H, Gan C, Rouditchenko A, Vondrick C, McDermott J, Torralba A (2018) The sound of pixels. In: Proceedings of the European conference on computer vision (ECCV), pp 570–586
70.
Zurück zum Zitat Bengio Y, Louradour J, Collobert R, Weston J (2009) Curriculum learning. In: Proceedings of the 26th annual international conference on machine learning, ACM, pp 41–48 Bengio Y, Louradour J, Collobert R, Weston J (2009) Curriculum learning. In: Proceedings of the 26th annual international conference on machine learning, ACM, pp 41–48
71.
Zurück zum Zitat Koffka K (2013) Principles of Gestalt psychology. Routledge Koffka K (2013) Principles of Gestalt psychology. Routledge
72.
73.
Zurück zum Zitat Stretcu O, Leordeanu M (2015) Multiple frames matching for object discovery in video. In: BMVC, pp 186–1 Stretcu O, Leordeanu M (2015) Multiple frames matching for object discovery in video. In: BMVC, pp 186–1
74.
Zurück zum Zitat Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 234–241 Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 234–241
75.
Zurück zum Zitat Leordeanu M, Hebert M (2005) A spectral technique for correspondence problems using pairwise constraints. In: ICCV Leordeanu M, Hebert M (2005) A spectral technique for correspondence problems using pairwise constraints. In: ICCV
76.
Zurück zum Zitat Leordeanu M, Hebert M, Sukthankar R (2009) An integer projected fixed point method for graph matching and map inference. In: NIPS Leordeanu M, Hebert M, Sukthankar R (2009) An integer projected fixed point method for graph matching and map inference. In: NIPS
77.
Zurück zum Zitat Brendel W, Todorovic S (2010) Segmentation as maximum-weight independent set. In: NIPS Brendel W, Todorovic S (2010) Segmentation as maximum-weight independent set. In: NIPS
78.
Zurück zum Zitat Jain A, Gupta A, Rodriguez M, Davis L (2013) Representing videos using mid-level discriminative patches. In: Computer vision and pattern recognition, pp 2571–2578 Jain A, Gupta A, Rodriguez M, Davis L (2013) Representing videos using mid-level discriminative patches. In: Computer vision and pattern recognition, pp 2571–2578
79.
Zurück zum Zitat Semenovich D (2010) Tensor power method for efficient map inference in higher-order mrfs. In: ICPR Semenovich D (2010) Tensor power method for efficient map inference in higher-order mrfs. In: ICPR
80.
Zurück zum Zitat Monroy A, Bell P, Ommer B (2014) Morphological analysis for investigating artistic images. Image Visi Comput 32(6) Monroy A, Bell P, Ommer B (2014) Morphological analysis for investigating artistic images. Image Visi Comput 32(6)
81.
Zurück zum Zitat Leordeanu M, Sminchisescu C (2012) Efficient hypergraph clustering. In: International conference on artificial intelligence and statistics Leordeanu M, Sminchisescu C (2012) Efficient hypergraph clustering. In: International conference on artificial intelligence and statistics
82.
Zurück zum Zitat Leordeanu M, Radu A, Baluja S, Sukthankar R (2015) Labeling the features not the samples: efficient video classification with minimal supervision. arXiv preprint arXiv:151200517 Leordeanu M, Radu A, Baluja S, Sukthankar R (2015) Labeling the features not the samples: efficient video classification with minimal supervision. arXiv preprint arXiv:​151200517
83.
Zurück zum Zitat Haller E, Leordeanu M (2017) Unsupervised object segmentation in video by efficient selection of highly probable positive features. In: The IEEE international conference on computer vision (ICCV) Haller E, Leordeanu M (2017) Unsupervised object segmentation in video by efficient selection of highly probable positive features. In: The IEEE international conference on computer vision (ICCV)
84.
Haller E, Florea AM, Leordeanu M (2019) Spacetime graph optimization for video object segmentation. arXiv preprint arXiv:1907.03326
85.
Besag J (1986) On the statistical analysis of dirty pictures. J Roy Stat Soc 48(5):259–302
86.
87.
Magnus JR, Neudecker H (1999) Matrix differential calculus with applications in statistics and econometrics. Wiley
88.
Cour T, Shi J, Gogin N (2005) Learning spectral graph segmentation. In: International conference on artificial intelligence and statistics
89.
Ding C, Li T, Jordan M (2008) Nonnegative matrix factorization for combinatorial optimization: spectral clustering, graph matching, and clique finding. In: IEEE international conference on data mining
90.
Motzkin T, Straus E (1965) Maxima for graphs and a new proof of a theorem of Turán. Canad J Math
91.
Bulo S, Pelillo M (2009) A game-theoretic approach to hypergraph clustering. In: NIPS
92.
Liu H, Latecki L, Yan S (2010) Robust clustering as ensembles of affinity relations. In: NIPS
93.
Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T (2014) Caffe: convolutional architecture for fast feature embedding. In: ACM multimedia
94.
Prest A, Leistner C, Civera J, Schmid C, Ferrari V (2012) Learning object class detectors from weakly annotated video. In: CVPR
95.
Alexe B, Deselaers T, Ferrari V (2012) Measuring the objectness of image windows. IEEE Trans Pattern Anal Mach Intell 34(11):2189–2202
96.
Meila M, Shi J (2001) A random walks view of spectral segmentation. In: AISTATS
97.
Leordeanu M, Sukthankar R, Hebert M (2012) Unsupervised learning for graph matching. Int J Comput Vis 96:28–45
98.
Croitoru I, Bogolin SV, Leordeanu M (2017) Unsupervised learning from video to detect foreground objects in single images. In: 2017 IEEE international conference on computer vision (ICCV), IEEE, pp 4345–4353
99.
Croitoru I, Bogolin SV, Leordeanu M (2019) Unsupervised learning of foreground object segmentation. Int J Comput Vis:1–24
Metadata
Title
Unsupervised Visual Learning: From Pixels to Seeing
Author
Marius Leordeanu
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-42128-1_1
