2020 | Original Paper | Book Chapter

1. Unsupervised Visual Learning: From Pixels to Seeing

Author: Marius Leordeanu

Published in: Unsupervised Learning in Space and Time

Publisher: Springer International Publishing

Abstract

This book is about unsupervised learning, one of the most challenging puzzles we must solve, piece by piece, in order to decode the secrets of intelligence. We move closer to that goal by connecting classical computational models to newer deep learning ones, and then building on a few fundamental and intuitive unsupervised learning principles. We aim to reduce the unsupervised learning problem to a set of essential ideas and then develop the computational tools needed to implement them in the real world. Ultimately, we aim to imagine a universal unsupervised learning machine, the Visual Story Network. The book is written for young students as well as experienced researchers, engineers, and professors. It presents computational models and optimization algorithms in sufficient technical detail, while maintaining an intuitive big picture of the main subject. Different tasks, such as graph matching and clustering, feature selection, classifier learning, unsupervised object discovery and segmentation in video, teacher-student learning over multiple generations, and recursive graph neural networks, are brought together, chapter by chapter, under the same umbrella of unsupervised learning. The current chapter introduces the reader to the overall story of the book and gives a unified picture of the topics that are presented in detail in the chapters that follow. Besides sharing the main goal of learning without human supervision, the problems and tasks presented in the book also share common computational graph models and optimization methods, such as spectral graph matching, spectral clustering, and the integer projected fixed point method. By bringing together similar mathematical formulations across different tasks, all guided by common intuitive principles towards a universal unsupervised learning system, the book invites the reader to absorb and then participate in the creation of the next generation of artificial intelligence.

Code available at: https://sites.google.com/site/multipleframesmatching/.

Title: Unsupervised Visual Learning: From Pixels to Seeing
Author: Marius Leordeanu
Publisher: Springer International Publishing
Book: Unsupervised Learning in Space and Time
Print ISBN: 978-3-030-42127-4
Electronic ISBN: 978-3-030-42128-1
Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-42128-1_1
