Abstract
The growing number of 3D scenes in online repositories provides new opportunities for data-driven scene understanding, editing, and synthesis. Despite the plethora of data now available online, most of it cannot be used effectively in data-driven applications because it lacks the consistent segmentations, category labels, and/or functional groupings required for co-analysis. In this paper, we develop algorithms that infer such information by parsing with a probabilistic grammar learned from examples. First, given a collection of scene graphs with consistent hierarchies and labels, we train a probabilistic hierarchical grammar to represent the distributions of shapes, cardinalities, and spatial relationships of semantic objects within the collection. Then, we use the learned grammar to parse new scenes and assign them segmentations, labels, and hierarchies consistent with the collection. In experiments with these algorithms, we find that they work effectively on scene graphs of indoor scenes commonly found online (bedrooms, classrooms, and libraries); they outperform alternative approaches that consider only shape similarities and/or spatial relationships without hierarchy; they require relatively small sets of training data; they are robust to moderate over-segmentation in the inputs; and they can robustly transfer labels from one data set to another. As a result, the proposed algorithms can be used to provide consistent hierarchies for large collections of scenes within the same semantic class.
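To make the train-then-parse idea concrete, the following is a minimal sketch, not the method of the paper. It covers only the cardinality part of such a grammar (how many children of each label a parent tends to have) and ignores shapes and spatial relationships entirely. The nested-tuple tree format, the function names `learn_cardinalities` and `log_likelihood`, and the smoothing constant `eps` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy format: a hierarchy node is (label, [child nodes]).

def learn_cardinalities(trees):
    """Estimate P(number of children with a given label | parent label)."""
    observations = defaultdict(list)  # parent label -> list of child-label Counters

    def visit(node):
        label, children = node
        observations[label].append(Counter(child[0] for child in children))
        for child in children:
            visit(child)

    for tree in trees:
        visit(tree)

    probs = {}
    for parent, obs in observations.items():
        child_labels = set().union(*obs)  # every child label seen under this parent
        for child in child_labels:
            # Histogram of how often the child label appeared 0, 1, 2, ... times.
            hist = Counter(o[child] for o in obs)
            probs[(parent, child)] = {k: v / len(obs) for k, v in hist.items()}
    return probs

def log_likelihood(tree, probs, eps=1e-6):
    """Score a candidate hierarchy; unseen cardinalities get probability eps (assumed)."""
    label, children = tree
    observed = Counter(child[0] for child in children)
    known = {child for (parent, child) in probs if parent == label}
    total = 0.0
    for child in known | set(observed):
        p = probs.get((label, child), {}).get(observed[child], eps)
        total += math.log(p)
    return total + sum(log_likelihood(c, probs, eps) for c in children)

training = [
    ("bedroom", [("bed", []), ("nightstand", []), ("nightstand", [])]),
    ("bedroom", [("bed", []), ("nightstand", [])]),
]
probs = learn_cardinalities(training)

plausible = ("bedroom", [("bed", []), ("nightstand", [])])
implausible = ("bedroom", [("bed", []), ("bed", []), ("bed", [])])
print(log_likelihood(plausible, probs) > log_likelihood(implausible, probs))
```

In this toy setting, a parser would enumerate candidate groupings and labelings of an unlabeled scene and keep the one with the highest score. A full grammar in the sense of the abstract would also add terms for shape and spatial-relationship distributions.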
Index Terms
- Creating consistent scene graphs using a probabilistic grammar