FORMS: A flexible object recognition and modelling system

Published in: International Journal of Computer Vision

Abstract

We describe a flexible object recognition and modelling system (FORMS) which represents and recognizes animate objects from their silhouettes. The system consists of a model for generating the shapes of animate objects, which provides a formalism for solving the inverse problem of object recognition. We model all objects at three levels of complexity: (i) the primitives, (ii) the mid-grained shapes, which are deformations of the primitives, and (iii) objects constructed by using a grammar to join mid-grained shapes together. The deformations of the primitives can be characterized by principal component analysis or modal analysis. During recognition, the representations of these objects are obtained in a bottom-up manner from their silhouettes by a novel method for skeleton extraction and part segmentation based on deformable circles. These representations are then matched to a database of prototypical objects to obtain a set of candidate interpretations, which are verified in a top-down process. The system is demonstrated to be stable in the presence of noise, the absence of parts, the presence of additional parts, and considerable variations in articulation and viewpoint. Finally, we describe how such a representation scheme can be automatically learnt from examples.
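As a rough illustration of how deformations of a primitive might be characterized by principal component analysis, the sketch below fits deformation modes to a set of contours and then encodes and decodes one shape with a few coefficients. It is not the authors' implementation. The function names, the circle primitive, the point-sampled contour representation, and the synthetic stretch deformations are all assumptions made for the example.

```python
import numpy as np


def fit_deformation_modes(shapes, num_modes):
    """Fit PCA modes to contours given as an (n_samples, n_points, 2) array.

    Assumes the contours are already aligned and in point-to-point
    correspondence (an assumption of this sketch, not stated in the abstract).
    """
    flat = shapes.reshape(len(shapes), -1)
    mean_shape = flat.mean(axis=0)
    centered = flat - mean_shape
    # Rows of vt are the principal directions of the centered data.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    modes = vt[:num_modes]
    variances = singular_values[:num_modes] ** 2 / max(len(shapes) - 1, 1)
    return mean_shape, modes, variances


def encode(shape, mean_shape, modes):
    """Project one (n_points, 2) contour onto the deformation modes."""
    return modes @ (shape.reshape(-1) - mean_shape)


def decode(coeffs, mean_shape, modes):
    """Rebuild an (n_points, 2) contour from mode coefficients."""
    return (mean_shape + coeffs @ modes).reshape(-1, 2)


def make_synthetic_shapes(n_samples=50, n_points=32, seed=0):
    """Hypothetical data: a unit circle stretched randomly along x and y."""
    rng = np.random.default_rng(seed)
    theta = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    stretch_x = 1.0 + 0.3 * rng.standard_normal(n_samples)
    stretch_y = 1.0 + 0.1 * rng.standard_normal(n_samples)
    scales = np.stack([stretch_x, stretch_y], axis=1)
    return circle[None, :, :] * scales[:, None, :]


if __name__ == "__main__":
    shapes = make_synthetic_shapes()
    mean_shape, modes, variances = fit_deformation_modes(shapes, num_modes=2)
    coeffs = encode(shapes[0], mean_shape, modes)
    reconstruction = decode(coeffs, mean_shape, modes)
    print("mode variances:", variances)
    print("max reconstruction error:", np.abs(reconstruction - shapes[0]).max())
```

Because the synthetic deformations have only two degrees of freedom, two modes are enough to reconstruct each contour here. Real silhouettes would generally need more modes and an explicit alignment step.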

Cite this article

Zhu, S.C., Yuille, A.L. FORMS: A flexible object recognition and modelling system. Int J Comput Vision 20, 187–212 (1996). https://doi.org/10.1007/BF00208719
