2018 | OriginalPaper | Book chapter

Physical Primitive Decomposition

Authors: Zhijian Liu, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Abstract

Objects are made of parts, each with distinct geometry, physics, functionality, and affordances. Developing such a distributed, physical, interpretable representation of objects will help intelligent agents better explore and interact with the world. In this paper, we study physical primitive decomposition: understanding an object through its components, each with physical and geometric attributes. Because annotated data for object parts and physics are rare, we propose a novel formulation that learns physical primitives by explaining both an object’s appearance and its behaviors in physical events. Our model performs well on block towers and tools in both synthetic and real scenarios; we also demonstrate that visual and physical observations often provide complementary signals. We further present ablation and behavioral studies to better understand our model and contrast it with human performance.

Title: Physical Primitive Decomposition
Authors: Zhijian Liu, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu
Publisher: Springer International Publishing
Book: Computer Vision – ECCV 2018
Print ISBN: 978-3-030-01257-1
Electronic ISBN: 978-3-030-01258-8
Copyright year: 2018
DOI: https://doi.org/10.1007/978-3-030-01258-8_1