research-article

Creating consistent scene graphs using a probabilistic grammar

Authors:
Tianqiang Liu, Princeton University
Siddhartha Chaudhuri, Princeton University and Cornell University
Vladimir G. Kim, Stanford University
Qixing Huang, Stanford University and Toyota Technological Institute at Chicago
Niloy J. Mitra, University College London
Thomas Funkhouser, Princeton University

ACM Transactions on Graphics, Volume 33, Issue 6, Article No. 211, pp. 1–12
https://doi.org/10.1145/2661229.2661243

Published: 19 November 2014

Abstract

Growing numbers of 3D scenes in online repositories provide new opportunities for data-driven scene understanding, editing, and synthesis. Despite the plethora of data now available online, most of it cannot be effectively used for data-driven applications because it lacks consistent segmentations, category labels, and/or functional groupings required for co-analysis. In this paper, we develop algorithms that infer such information via parsing with a probabilistic grammar learned from examples. First, given a collection of scene graphs with consistent hierarchies and labels, we train a probabilistic hierarchical grammar to represent the distributions of shapes, cardinalities, and spatial relationships of semantic objects within the collection. Then, we use the learned grammar to parse new scenes to assign them segmentations, labels, and hierarchies consistent with the collection. During experiments with these algorithms, we find that: they work effectively for scene graphs for indoor scenes commonly found online (bedrooms, classrooms, and libraries); they outperform alternative approaches that consider only shape similarities and/or spatial relationships without hierarchy; they require relatively small sets of training data; they are robust to moderate over-segmentation in the inputs; and, they can robustly transfer labels from one data set to another. As a result, the proposed algorithms can be used to provide consistent hierarchies for large collections of scenes within the same semantic class.
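
As a rough illustration of one ingredient named in the abstract, the sketch below estimates child-cardinality distributions from a few labeled hierarchies and uses them to score candidate hierarchies. It is a toy under stated assumptions, not the paper's method: the labels, the training trees, the smoothing constant `alpha`, the cap `max_k`, and the function names are all hypothetical, and shape descriptors, spatial relationships, and the parsing search itself are omitted.

```python
import math
from collections import Counter, defaultdict

# A node is (label, [children]); leaves have an empty child list.
# Hypothetical toy hierarchies, not data from the paper.
TRAINING = [
    ("bedroom", [("sleeping_area", [("bed", []), ("nightstand", []), ("nightstand", [])]),
                 ("storage_area", [("dresser", [])])]),
    ("bedroom", [("sleeping_area", [("bed", []), ("nightstand", [])]),
                 ("storage_area", [("dresser", []), ("dresser", [])])]),
    ("bedroom", [("sleeping_area", [("bed", []), ("nightstand", []), ("nightstand", [])])]),
]


def learn(trees):
    """For each parent label, record the child-label counts of every instance."""
    observations = defaultdict(list)  # parent label -> list of Counter(child label -> count)

    def visit(node):
        label, children = node
        if children:
            observations[label].append(Counter(child[0] for child in children))
            for child in children:
                visit(child)

    for tree in trees:
        visit(tree)
    return observations


def cardinality_prob(observations, parent, child, k, max_k=4, alpha=1.0):
    """Laplace-smoothed estimate of P(parent has exactly k children labeled child)."""
    instances = observations.get(parent, [])
    matches = sum(1 for obs in instances if obs[child] == k)
    return (matches + alpha) / (len(instances) + alpha * (max_k + 1))


def log_score(observations, node, max_k=4, alpha=1.0):
    """Sum of log cardinality probabilities over all internal nodes of a hierarchy."""
    label, children = node
    if not children:
        return 0.0
    present = Counter(child[0] for child in children)
    # Also score child labels seen in training but absent here (cardinality 0).
    known = set(present)
    for obs in observations.get(label, []):
        known.update(obs)
    total = 0.0
    for child in sorted(known):
        k = min(present[child], max_k)  # counts above the cap share the last bin
        total += math.log(cardinality_prob(observations, label, child, k, max_k, alpha))
    for child_node in children:
        total += log_score(observations, child_node, max_k, alpha)
    return total


model = learn(TRAINING)
typical = ("sleeping_area", [("bed", []), ("nightstand", []), ("nightstand", [])])
atypical = ("sleeping_area", [("bed", []), ("bed", []), ("bed", [])])
print(log_score(model, typical), log_score(model, atypical))
```

In this toy, a hierarchy whose child counts resemble the training examples receives a higher log score than one that does not. A parser in the spirit of the abstract would search over segmentations and labelings for the highest-scoring hierarchy, using many more terms than cardinality alone.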

Index Terms

1. Computing methodologies
  1. Computer graphics
    1. Shape modeling
2. Theory of computation
  1. Randomness, geometry and discrete structures

Published in

ACM Transactions on Graphics Volume 33, Issue 6
November 2014
704 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2661229

Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
scene collections
scene understanding
Qualifiers
- research-article
