2018 | OriginalPaper | Book chapter

Physical Primitive Decomposition

Authors: Zhijian Liu, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Abstract

Objects are made of parts, each with distinct geometry, physics, functionality, and affordances. Developing such a distributed, physical, interpretable representation of objects will help intelligent agents better explore and interact with the world. In this paper, we study physical primitive decomposition: understanding an object through its components, each with physical and geometric attributes. Because annotated data for object parts and physics are rare, we propose a novel formulation that learns physical primitives by explaining both an object’s appearance and its behaviors in physical events. Our model performs well on block towers and tools in both synthetic and real scenarios; we also demonstrate that visual and physical observations often provide complementary signals. We further present ablation and behavioral studies to better understand our model and contrast it with human performance.

Metadata
Title
Physical Primitive Decomposition
Authors
Zhijian Liu
William T. Freeman
Joshua B. Tenenbaum
Jiajun Wu
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01258-8_1