2025 | OriginalPaper | Chapter

Octopus: Embodied Vision-Language Programmer from Environmental Feedback

Authors: Jingkang Yang, Yuhao Dong, Shuai Liu, Bo Li, Ziyue Wang, Haoran Tan, Chencheng Jiang, Jiamu Kang, Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu

Published in: Computer Vision – ECCV 2024

Publisher: Springer Nature Switzerland

Abstract

The chapter 'Octopus: Embodied Vision-Language Programmer from Environmental Feedback' introduces Octopus, a groundbreaking vision-language programmer that translates natural language instructions into executable code. Octopus leverages egocentric vision to generate plans and code, addressing the gap between high-level planning and real-world manipulation. The authors present the OctoVerse environment, a suite of simulators including OctoGibson, OctoMC, and OctoGTA, designed to train and benchmark Octopus. The model is trained using Reinforcement Learning with Environmental Feedback (RLEF), demonstrating superior performance in task planning, code generation, and execution. The chapter also highlights the importance of vision-dependent function calls and the challenges of existing simulators. By open-sourcing their environments, dataset, and architecture, the authors aim to foster innovation in the field of embodied vision-language programming.

Metadata
Title: Octopus: Embodied Vision-Language Programmer from Environmental Feedback
Authors: Jingkang Yang, Yuhao Dong, Shuai Liu, Bo Li, Ziyue Wang, Haoran Tan, Chencheng Jiang, Jiamu Kang, Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
Copyright Year: 2025
DOI: https://doi.org/10.1007/978-3-031-73232-4_2
