
International Journal of Computer Vision

Published: 01.02.2014

Rotation-Invariant HOG Descriptors Using Fourier Analysis in Polar and Spherical Coordinates

Authors: Kun Liu, Henrik Skibbe, Thorsten Schmidt, Thomas Blein, Klaus Palme, Thomas Brox, Olaf Ronneberger

Published in: International Journal of Computer Vision | Issue 3/2014


Abstract

The histogram of oriented gradients (HOG) is widely used for image description and has proven very effective. In many vision problems, rotation-invariant analysis is necessary or preferred. Popular solutions are mainly based on pose normalization or learning, neglecting some intrinsic properties of rotations. This paper presents a method to build rotation-invariant HOG descriptors using Fourier analysis in polar/spherical coordinates, which are closely related to the irreducible representation of the 2D/3D rotation groups. This is achieved by considering a gradient histogram as a continuous angular signal which can be well represented by the Fourier basis (2D) or spherical harmonics (3D). As rotation-invariance is established in an analytical way, we can avoid discretization artifacts and create a continuous mapping from the image to the feature space. In the experiments, we first show that our method outperforms the state of the art on a public dataset for a car detection task in aerial images. We further use the Princeton Shape Benchmark and the SHREC 2009 Generic Shape Benchmark to demonstrate the high performance of our method for similarity measures of 3D shapes. Finally, we show an application on microscopic volumetric data.
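The central idea of the abstract, treating the gradient histogram of one cell as a continuous angular signal expanded in the Fourier basis, can be illustrated with a minimal 2D sketch. It is not the authors' implementation: the function names `fourier_hog_cell` and `rotate_gradients`, the sign convention `exp(-i m phi)`, and the choice of `max_order` are assumptions made for illustration, and the sketch covers a single cell only, not the spatial arrangement of cells handled by the full descriptor.

```python
import numpy as np

def fourier_hog_cell(gx, gy, max_order=4):
    """Fourier coefficients c_0..c_max_order of one cell's orientation signal.

    Each gradient contributes a Dirac impulse at its orientation phi_k,
    weighted by its magnitude, so c_m = sum_k |g_k| * exp(-i * m * phi_k).
    No orientation binning is involved.
    """
    gx = np.asarray(gx, dtype=float).ravel()
    gy = np.asarray(gy, dtype=float).ravel()
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)
    orders = np.arange(max_order + 1)
    basis = np.exp(-1j * orders[:, None] * angle[None, :])
    return (magnitude[None, :] * basis).sum(axis=1)

def rotate_gradients(gx, gy, alpha):
    """Rotate every gradient vector counter-clockwise by alpha (radians)."""
    c, s = np.cos(alpha), np.sin(alpha)
    return c * gx - s * gy, s * gx + c * gy

# Hypothetical test data: 50 random gradient vectors standing in for one cell.
rng = np.random.default_rng(0)
gx, gy = rng.normal(size=50), rng.normal(size=50)
alpha = 0.7  # arbitrary angle, not a multiple of any bin width

coeffs = fourier_hog_cell(gx, gy)
coeffs_rot = fourier_hog_cell(*rotate_gradients(gx, gy, alpha))

# Rotation only changes the phase: c_m -> exp(-i * m * alpha) * c_m ...
print(np.allclose(coeffs_rot, np.exp(-1j * np.arange(5) * alpha) * coeffs))
# ... so the magnitudes |c_m| are rotation-invariant.
print(np.allclose(np.abs(coeffs), np.abs(coeffs_rot)))
```

Both checks should print `True` for any rotation angle, which is the sense in which invariance is obtained analytically rather than by aligning histogram bins.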


In this paper, a quantity that describes certain image content is generally called a feature; a single gradient histogram computed in a local patch is referred to as a HOG cell; an assembled feature vector that describes a region of multiple cells is referred to as a HOG descriptor.

The property in Eq. (6) has also been referred to as equivariance in some works (Reisert and Burkhardt 2008; Vedaldi et al. 2011).

In this paper, we do not rely on this polar tensor concept, because we do not need any special mathematical tools for the related analysis of 2D images.

We purposely define the expansion coefficients with a conjugation, which makes the expansion a standard inner product between the coefficients and the spherical harmonics (SH) basis. The same convention is used in Reisert and Burkhardt (2009). The advantage is that this linear expansion can be understood as a coupling between two spherical tensors, which will be explained later.

This operator is written as \(\circ _\ell \) in Reisert and Burkhardt (2009), since \(\ell _1, \ell _2\) can be inferred from the two coupled tensors. In this paper we use the more explicit notation \({\otimes }_{(\ell |\ell _1,\ell _2)}\).

http://lmb.informatik.uni-freiburg.de/resources/opensource/FourierHOG/

The coupling used here is only a portion of all possible combinations. We prefer these simple choices since we only want to demonstrate the descriptive power of the proposed method. We believe that the optimal feature selection is application-dependent. A classifier such as a linear SVM or a Random Forest, which has built-in feature selection ability, makes it possible to increase the dimensionality of the feature vector by adding more coupled features.
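As a loose 2D analog of coupling (an illustrative assumption, not the tensor operator used in the paper), suppose each Fourier coefficient of order m picks up a phase `exp(-i m alpha)` when the image is rotated by `alpha`. Multiplying coefficients so that their orders cancel then yields rotation-invariant features. The function name `coupled_invariants` and the particular triple products chosen are hypothetical:

```python
import numpy as np

def coupled_invariants(coeffs):
    """Triple products c_m1 * c_m2 * conj(c_{m1+m2}) for 1 <= m1 <= m2.

    Under rotation by alpha the three factors contribute phases
    exp(-i*m1*alpha), exp(-i*m2*alpha) and exp(+i*(m1+m2)*alpha),
    which cancel, so each product is rotation-invariant.
    """
    coeffs = np.asarray(coeffs, dtype=complex)
    max_order = len(coeffs) - 1
    features = []
    for m1 in range(1, max_order + 1):
        for m2 in range(m1, max_order - m1 + 1):
            features.append(coeffs[m1] * coeffs[m2] * np.conj(coeffs[m1 + m2]))
    return np.array(features)

# Hypothetical coefficients of orders 0..4 and their rotated counterparts.
rng = np.random.default_rng(1)
coeffs = rng.normal(size=5) + 1j * rng.normal(size=5)
alpha = 1.3
rotated = np.exp(-1j * np.arange(5) * alpha) * coeffs

features = coupled_invariants(coeffs)
print(len(features))
print(np.allclose(features, coupled_invariants(rotated)))
```

Unlike plain magnitudes, such products keep relative phase information between orders. Adding more of them grows the feature vector, which is where a classifier with built-in feature selection becomes useful.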

Patrick Min, https://www.google.com/search?q=binvox

We created the ground truth by manually editing a watershed segmentation result. Some very badly segmented regions were discarded and were not used for training.

Ahonen, T., Matas, J., He, C., & Pietikäinen, M. (2009). Rotation invariant image description with local binary pattern histogram Fourier features. In Scandinavian Conference on Image Analysis, pp. 61–70.

Akgül, C., Axenopoulos, A., Bustos, B., Chaouch, M., Daras, P., Dutagaci, H., Furuya, T., Godil, A., Kreft, S., Lian, Z., et al. (2009). SHREC 2009-Generic Shape Retrieval contest. In Eurographics workshop on 3D object retrieval.

Allaire, S., Kim, J., Breen, S., Jaffray, D., & Pekar, V. (2008). Full orientation invariance and improved feature selectivity of 3D SIFT with application to medical image analysis. In CVPR Workshops.

Arsenault, H., & Sheng, Y. (1986). Properties of the circular harmonic expansion for rotation-invariant pattern recognition. Applied Optics, 25(18), 3225–3229.

Bendale, P., Triggs, B., & Kingsbury, N. (2010). Multiscale keypoint analysis based on complex wavelets. In British Machine Vision Conference, pp. 49.1–49.10.

Bourdev, L., & Malik, J. (2009). Poselets: Body part detectors trained using 3D human pose annotations. In International Conference on Computer Vision, pp. 1365–1372.

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.

Brink, D., & Satchler, G. (1968). Angular momentum. Oxford: Clarendon Press.

Bülow, T. (2004). Spherical diffusion for 3D surface smoothing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(12), 1650–1654.

Burkhardt, H., & Siggelkow, S. (2001). Invariant features in pattern recognition—fundamentals and applications. In C. Kotropoulos & I. Pitas (Eds.), Nonlinear model-based image/video processing and analysis (pp. 269–307). New York: Wiley.

Chang, C.-C., & Lin, C.-J. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2, 27:1–27:27. Software available at http://www.csie.ntu.edu.tw/cjlin/libsvm

Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.

Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 886–893.

Driscoll, J., & Healy, D. (1994). Computing Fourier transforms and convolutions on the 2-sphere. Advances in Applied Mathematics, 15(2), 202–250.

Fan, R., Chang, K., Hsieh, C., Wang, X., & Lin, C. (2008). LIBLINEAR: A library for large linear classification. The Journal of Machine Learning Research, 9, 1871–1874.

Fehr, J. (2010). Local rotation invariant patch descriptors for 3D vector fields. In International Conference on Pattern Recognition, pp. 1381–1384.

Felzenszwalb, P., Girshick, R., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9), 1627–1645.

Flitton, G., Breckon, T., & Megherbi, N. (2010). Object recognition using 3D SIFT in complex CT volumes. In British Machine Vision Conference, pp. 11.1–11.12.

Fornasier, M., & Toniolo, D. (2005). Fast, robust and efficient 2D pattern recognition for re-assembling fragmented images. Pattern Recognition, 38(11), 2074–2087.

Förstner, W., & Gülch, E. (1987). A fast operator for detection and precise location of distinct points, corners and centres of circular features. In ISPRS intercommission conference on fast processing of photogrammetric data, pp. 281–305.

Freeman, W., & Adelson, E. (1991). The design and use of steerable filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(9), 891–906.

Gauglitz, S. (2011). Improving keypoint orientation assignment. In British Machine Vision Conference, pp. 93.1–93.11.

Giannakis, G. (1989). Signal reconstruction from multiple correlations: frequency- and time-domain approaches. Journal of Optical Society of America A, 6(5), 682–697.

Golub, G., & Van Loan, C. (1996). Matrix computations. Baltimore: Johns Hopkins Univ Press.

Green, R. (2003). Spherical harmonic lighting: The gritty details. In Game Developers Conference, 2, 2–3.

Haasdonk, B., & Burkhardt, H. (2007). Invariant kernel functions for pattern analysis and machine learning. Machine Learning, 68(1), 35–61.

Heitz, G., & Koller, D. (2008). Learning spatial context: Using stuff to find things. In European Conference on Computer Vision, pp. 30–43.

Jacovitti, G., & Neri, A. (2000). Multiresolution circular harmonic decomposition. IEEE Transactions on Signal Processing, 48(11), 3242–3247.

Kavukcuoglu, K., Ranzato, M., Fergus, R., & LeCun, Y. (2009). Learning invariant features through topographic filter maps. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 1605–1612.

Kazhdan, M., Funkhouser, T., & Rusinkiewicz, S. (2003). Rotation invariant spherical harmonic representation of 3D shape descriptors. In Eurographics/ACM SIGGRAPH symposium on Geometry processing, pp. 156–164.

Kläser, A., Marszałek, M., & Schmid, C. (2008). A spatio-temporal descriptor based on 3D-gradients. In British Machine Vision Conference, pp. 995–1004.

Knopp, J., Prasad, M., & Van Gool, L. (2010a). Orientation invariant 3D object classification using Hough transform based methods. In ACM Multimedia Workshop, pp. 15–20.

Knopp, J., Prasad, M., Willems, G., Timofte, R., & Van Gool, L. (2010b). Hough transform and 3D SURF for robust three dimensional classification. In European Conference on Computer Vision, pp. 589–602.

LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.

Lenz, R. (1990). Group theoretical methods in image processing. Berlin: Springer.

Lin, W., Liu, L., Matsushita, Y., Low, K., & Liu, S. (2012). Aligning images in the wild. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8.

Liu, K., Skibbe, H., Schmidt, T., Blein, T., Palme, K., & Ronneberger, O. (2011). 3D rotation-invariant description from tensor operation on spherical HOG field. In British Machine Vision Conference, pp. 33.1–33.12.

Liu, K., Wang, Q., Driever, W., & Ronneberger, O. (2012). 2D/3D Rotation-invariant Detection using Equivariant Filters and Kernel Weighted Mapping. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 917–924.

Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.

Makadia, A., & Daniilidis, K. (2010). Spherical correlation of visual representations for 3D model retrieval. International Journal of Computer Vision, 89(2), 193–210.

Memisevic, R., & Hinton, G. (2010). Learning to represent spatial transformations with factored higher-order Boltzmann machines. Neural Computation, 22(6), 1473–1492.

Özuysal, M., Calonder, M., Lepetit, V., & Fua, P. (2010). Fast keypoint recognition using random ferns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 448–461.

Ponce, C., & Singer, A. (2011). Computing steerable principal components of a large set of images and their rotations. IEEE Transactions on Image Processing, 20(11), 3051–3062.

Reisert, M., & Burkhardt, H. (2008a). Efficient tensor voting with 3D tensorial harmonics. In CVPR Workshops.

Reisert, M., & Burkhardt, H. (2008b). Equivariant holomorphic filters for contour denoising and rapid object detection. IEEE Transactions on Image Processing, 17(2), 190–203.

Reisert, M., & Burkhardt, H. (2009). Spherical tensor calculus for local adaptive filtering. In S. Aja-Fernández, R. de Luis García, D. Tao, & X. Li (Eds.), Tensors in image processing and computer vision, Advances in Pattern Recognition (pp. 153–178). USA: Springer.

Ronneberger, O., Burkhardt, H., & Schultz, E. (2002). General-purpose Object Recognition in 3D Volume Data Sets using Gray-Scale Invariants—Classification of Airborne Pollen-Grains Recorded with a Confocal Laser Scanning Microscope. In International Conference on Pattern Recognition, 2, 290–295.

Ronneberger, O., Liu, K., Rath, M., Ruess, D., Mueller, T., Skibbe, H., et al. (2012). ViBE-Z: a framework for 3D virtual colocalization analysis in zebrafish larval brains. Nature Methods, 9(7), 735–742.

Ronneberger, O., Wang, Q., & Burkhardt, H. (2007). 3D Invariants with High Robustness to Local Deformations for Automated Pollen Recognition. In DAGM conference on Pattern recognition, pp. 455–435.

Rose, M. (1957). Elementary theory of angular momentum. New York: Wiley.

Scherer, M., Walter, M., & Schreck, T. (2010). Histograms of Oriented Gradients for 3D Model Retrieval. In International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, pp. 41–48.

Schmidt, T., Keuper, M., Pasternak, T., Palme, K., & Ronneberger, O. (2012). Modeling of Sparsely Sampled Tubular Surfaces Using Coupled Curves. In DAGM conference on Pattern recognition, pp. 83–92.

Schmidt, U., & Roth, S. (2012). Learning rotation-aware features: From invariant priors to equivariant descriptors. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 2050–2057.

Schultz, T., Weickert, J., & Seidel, H. (2009). A higher-order structure tensor. In D. Laidlaw & J. Weickert (Eds.), Visualization and processing of tensor fields (pp. 263–279). Berlin: Springer.

Sheng, Y., & Arsenault, H. (1986). Experiments on pattern recognition using invariant Fourier-Mellin descriptors. Journal of Optical Society of America A, 3(6), 771–776.

Shilane, P., Min, P., Kazhdan, M., & Funkhouser, T. (2004). The Princeton Shape Benchmark. In International Conference on Shape Modeling and Applications, pp. 167–178.

Skibbe, H., & Reisert, M. (2012). Circular Fourier-HOG features for rotation invariant object detection in biomedical images. In IEEE International Symposium on Biomedical Imaging, pp. 450–453.

Skibbe, H., Reisert, M., & Burkhardt, H. (2011). SHOG-spherical HOG descriptors for rotation invariant 3D object detection. In DAGM conference on Pattern recognition, pp. 142–151.

Skibbe, H., Reisert, M., Ronneberger, O., & Burkhardt, H. (2009). Increasing the dimension of creativity in rotation invariant feature design using 3D tensorial harmonics. In DAGM conference on Pattern recognition, pp. 141–150.

Skibbe, H., Reisert, M., Schmidt, T., Brox, T., Ronneberger, O., & Burkhardt, H. (2012). Fast rotation invariant 3D feature computation utilizing efficient local neighborhood operators. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(8), 1563–1575. Software available at https://bitbucket.org/skibbe/sta-imagetoolbox

Takacs, G., Chandrasekhar, V., Tsai, S., Chen, D., Grzeszczuk, R., & Girod, B. (2010). Unified real-time tracking and recognition with rotation-invariant fast features. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 934–941.

Vedaldi, A., Blaschko, M., & Zisserman, A. (2011). Learning equivariant structured output SVM regressors. In International Conference on Computer Vision, pp. 959–966.

Villamizar, M., Moreno-Noguer, F., Andrade-Cetto, J., & Sanfeliu, A. (2010). Efficient rotation invariant object detection using boosted random ferns. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 1038–1045.

Wang, Q., Ronneberger, O., & Burkhardt, H. (2009). Rotational invariance based on Fourier analysis in polar and spherical coordinates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 1715–1722.

Wolberg, G., & Zokai, S. (2000). Robust image registration using log-polar transform. In IEEE International Conference on Image Processing, pp. 493–496.

Title: Rotation-Invariant HOG Descriptors Using Fourier Analysis in Polar and Spherical Coordinates
Authors: Kun Liu, Henrik Skibbe, Thorsten Schmidt, Thomas Blein, Klaus Palme, Thomas Brox, Olaf Ronneberger
Publication date: 01.02.2014
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 3/2014
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-013-0634-z
