
Machine Vision and Applications


1 May 2014 | Original Paper

When standard RANSAC is not enough: cross-media visual matching with hypothesis relevancy

Authors: Tal Hassner, Liav Assif, Lior Wolf

Published in: Machine Vision and Applications | Issue 4/2014


Abstract

The same scene can be depicted by multiple visual media. For example, the same event can be captured by a comic image or a movie frame; the same object can be represented by a photograph or by a 3D computer graphics model. To extract the visual analogies that are at the heart of cross-media analysis, spatial matching is required. This matching is commonly achieved by extracting key points and scoring multiple, randomly generated mapping hypotheses. The more consensus a hypothesis can draw, the higher its score. In this paper, we go beyond the conventional set-size measure for the quality of a match and present a more general hypothesis score that attempts to reflect how likely each hypothesized transformation is to be the correct one for the matching task at hand. This is achieved by considering additional, contextual cues for the relevance of a hypothesized transformation. This context changes from one matching task to another and reflects different properties of the match, beyond the size of a consensus set. We demonstrate that by learning how to correctly score each hypothesis based on these features, we are able to deal much more robustly with the challenges of cross-media analysis, leading to correct matches where conventional methods fail.
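The difference between the conventional set-size score and a more general hypothesis score can be illustrated with a minimal sketch. This is not the authors' implementation: the translation-only model, the function names, the "small motion" cue, and the fixed weights are all assumptions made for illustration, whereas the paper learns how to score hypotheses from task-specific contextual features.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]        # (key point in image A, putative match in image B)
Translation = Tuple[float, float]  # toy transformation model: B = A + (dx, dy)
Score = Callable[[Translation, List[Match]], float]


def consensus_set(h: Translation, matches: Sequence[Match], tol: float) -> List[Match]:
    """Matches whose mapped source point lands within `tol` of its target."""
    dx, dy = h
    return [(a, b) for a, b in matches
            if math.hypot(a[0] + dx - b[0], a[1] + dy - b[1]) <= tol]


def set_size_score(h: Translation, inliers: List[Match]) -> float:
    """Conventional RANSAC score: the size of the consensus set."""
    return float(len(inliers))


def small_motion_cue(h: Translation, inliers: List[Match]) -> float:
    """Hypothetical contextual cue: prefer small displacements."""
    return -math.hypot(h[0], h[1])


def make_weighted_score(weights: Sequence[float], cues: Sequence[Score]) -> Score:
    """Illustrative generalized score: a weighted sum of cues.
    The weights are fixed by hand here; in practice they would be learned."""
    def score(h: Translation, inliers: List[Match]) -> float:
        return sum(w * cue(h, inliers) for w, cue in zip(weights, cues))
    return score


def ransac(matches: Sequence[Match], score: Score, tol: float = 1.0,
           iterations: int = 200, seed: int = 0) -> Tuple[Optional[Translation], float]:
    """Sample minimal sets at random and keep the best-scoring hypothesis."""
    rng = random.Random(seed)
    best_h, best_score = None, float("-inf")
    for _ in range(iterations):
        a, b = rng.choice(matches)      # one match determines a translation
        h = (b[0] - a[0], b[1] - a[1])
        s = score(h, consensus_set(h, matches, tol))
        if s > best_score:
            best_h, best_score = h, s
    return best_h, best_score


# Toy data: four matches agree on the shift (2, 1); five distractor matches
# (for example, from a repeated structure) agree on the shift (40, -30).
matches: List[Match] = [((x, y), (x + 2.0, y + 1.0))
                        for x, y in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]]
matches += [((x, y), (x + 40.0, y - 30.0))
            for x, y in [(1.0, 1.0), (3.0, 5.0), (7.0, 2.0), (8.0, 8.0), (5.0, 5.0)]]

relevancy = make_weighted_score([1.0, 0.1], [set_size_score, small_motion_cue])

print("set-size score picks:", ransac(matches, set_size_score)[0])
print("weighted score picks:", ransac(matches, relevancy)[0])
```

On this toy data the set-size score selects the distractor shift because its consensus set is larger, while the weighted score selects the small shift once the extra cue is taken into account. Which cues are informative depends on the matching task, which is why the abstract describes the context as changing from one task to another.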


Please see the project webpage for available resources, including our MATLAB functions for rendering and computing the transformations. URL: http://www.openu.ac.il/home/hassner/projects/ransaclearn.

Source: http://sketchup.google.com/3dwarehouse.

Source: http://www.minecraft.net.


Title: When standard RANSAC is not enough: cross-media visual matching with hypothesis relevancy
Authors: Tal Hassner, Liav Assif, Lior Wolf
Publication date: 1 May 2014
Publisher: Springer Berlin Heidelberg
Published in: Machine Vision and Applications / Issue 4/2014
Print ISSN: 0932-8092
Electronic ISSN: 1432-1769
DOI: https://doi.org/10.1007/s00138-013-0571-4
