2024 | OriginalPaper | Book chapter

1. Ambrosinus-Toolkit Plugin: Artificial Intelligence Text-to-Image Generative Models Through Grasshopper

Author: Luciano Ambrosini

Published in: Coding Architecture

Publisher: Springer Nature Switzerland

Abstract

Large-scale text-to-image generative models (LTGMs) such as DALL-E and Stable Diffusion offer considerable opportunities to improve the creation and manipulation of design communication through images. The toolkit developed by the author shows how to integrate into the creative design process both the LTGM technology introduced by the OpenAI and StabilityAI platforms and the Dense Prediction Transformers (DPT) technology, which can predict a 3D-shaped object starting from a 2D image generated by the AI. This contribution provides the key elements for understanding some fundamental aspects of research on artificial intelligence, while the “Ambrosinus-Toolkit” project, publicly shared online, gives the reader operating cues that can be tried out during the schematic design stage. The research presented marks a first step toward a new paradigm that uses Diffusion Models in the computational design workflow.


Google Colab, or “Colaboratory”, is a browser-based programming platform that lets students, researchers and general users execute Python code and use GPU hardware remotely, either for free or by subscription. Google Colab homepage, https://colab.research.google.com/, last accessed 2023/01/31.

Midjourney (MJ) works essentially as a Discord bot, that is, an automated module within Discord, an instant messaging platform designed for communication among gaming communities and internet users. MJ is best known for its artistic-style results.

The “node” logic is a user-interface simplification already known from the Maya Hypergraph. In the early 2000s, the lack of an agreement with Bentley Systems gave way to the development of Grasshopper (David Rutten, 2008), which departed sharply from the GenerativeComponents project (John Nastasi, 2003), a forerunner of visual programming language (VPL) platforms.

A GAN (generative adversarial network) consists of two neural networks: a generator that produces new data and a discriminator that tells real data from fake data. They are trained in a game in which the generator tries to fool the discriminator and the discriminator tries to catch the generator.
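The game described in this note is commonly written as the minimax objective of the original GAN formulation (Goodfellow et al., 2014), which the note itself does not state. The notation is assumed here for illustration: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the distribution of real data and $p_z$ the noise prior fed to the generator.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator maximizes $V$ by assigning high probability to real samples and low probability to generated ones, while the generator minimizes the second term by making $D(G(z))$ as large as possible.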

The other two modes are Supervised Learning (SL), which uses labeled data to train a model, and Unsupervised Learning (UL), which uses unlabeled data to find patterns and structure. There is also a further mode, Self-Supervised Learning (SSL), a kind of UL that creates artificial labels from raw data.
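The idea of creating artificial labels from raw data can be illustrated with a minimal sketch. It assumes a next-token setup, which is only one of several ways to derive labels; the function name `make_next_token_pairs` and the sample sentence are hypothetical and not part of the toolkit.

```python
def make_next_token_pairs(tokens, context_size=3):
    """Build (context, target) pairs from an unlabeled token sequence.

    The target "label" is simply the token that follows each context
    window, so no human annotation is needed.
    """
    pairs = []
    for i in range(len(tokens) - context_size):
        context = tuple(tokens[i:i + context_size])
        target = tokens[i + context_size]
        pairs.append((context, target))
    return pairs


# Raw, unlabeled text (illustrative only).
raw_text = "a generator tries to fool the discriminator"
pairs = make_next_token_pairs(raw_text.split(), context_size=3)
for context, target in pairs:
    print(context, "->", target)
```

Each pair can then be fed to an ordinary supervised training loop, which is why SSL is often described as supervised learning on labels the data provides for itself.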

Transformers, introduced in 2017, are a type of neural network for processing sequential data such as text. They use a self-attention mechanism to understand the meaning of words in context by weighing their importance in the input sequence.
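The weighing step mentioned in this note can be sketched as single-head scaled dot-product self-attention. This is a minimal NumPy illustration with randomly initialized projection matrices; the function names, dimensions and random inputs are assumptions chosen for the example, not taken from the chapter or from any specific model.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x has shape (sequence length, model dimension); each row is one token.
    Returns the attended output and the attention weight matrix.
    """
    q = x @ w_q  # queries
    k = x @ w_k  # keys
    v = x @ w_v  # values
    # Similarity of every token with every other token, scaled by sqrt(d_k).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Each row becomes a probability distribution over the input tokens.
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 8
x = rng.normal(size=(seq_len, d_model))  # stand-in for token embeddings
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

output, weights = self_attention(x, w_q, w_k, w_v)
print(output.shape)          # one output vector per input token
print(weights.sum(axis=1))   # each row of weights sums to 1
```

Row `i` of `weights` shows how much token `i` draws on every token in the sequence, which is the "importance" the note refers to.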

Required Python libraries: atoolkitdpt, opencv-python, Pillow, numpy, matplotlib, transformers, torch, torchvision, imutils, open3d, openai, stability-sdk, timm. The documentation is available at https://bit.ly/LA-WYSIWYTfromDPTto3D and will be publicly shared through ATk v1.1.8/v1.1.9.
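Before running the components, it can help to check which of the listed libraries are present in the active Python environment. The following is a minimal sketch using only the standard library; it looks up the distribution names exactly as given in the note, and the helper name `installed_versions` is hypothetical rather than part of the toolkit.

```python
from importlib import metadata

# Distribution names as listed in the note above.
REQUIRED = [
    "atoolkitdpt", "opencv-python", "Pillow", "numpy", "matplotlib",
    "transformers", "torch", "torchvision", "imutils", "open3d",
    "openai", "stability-sdk", "timm",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(REQUIRED)
for name, version in report.items():
    print(f"{name}: {version or 'MISSING'}")
```

Querying distribution metadata avoids having to guess import names, which differ from the package names in some cases (for example, `opencv-python` is imported as `cv2`).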

The Stable Diffusion engines included in ATk are: v1, v1.5, v2, v2.1, inpainting v1 and inpainting v2.

The samplers included in ATk are: ddim, ddpm, k-dpm 2, k-dpm 2 ancestral, k-dpmpp 2m, k-dpmpp 2s ancestral, k-euler, k-euler ancestral, k-heun, k-lms.

Video demo, https://youtu.be/1VXe-R44nuw, last accessed 2023/02/13.

PyPI repository link, https://pypi.org/project/atoolkitdpt/, last accessed 2023/02/10.

Global–Local Path Networks for Monocular Depth Estimation with Vertical CutDepth, https://github.com/vinvino02/GLPDepth, last accessed 2023/02/10.

DPT model weights for semantic segmentation, https://github.com/isl-org/DPT, last accessed 2023/02/10.

A further example of variation is provided by the author in this video demo, https://youtu.be/UPTc5o6rOLM, last accessed 2023/02/13.

A further example of editing is provided by the author in this video demo, https://youtu.be/nmJTP1QvuIE, last accessed 2023/02/13.

Video demo, https://bit.ly/LA-DPTto3Dobj, last accessed 2023/02/21.

In the last few months Ambrosinus-Toolkit has received several updates, including the option to run Stable Diffusion on a local machine and to use it as a rendering engine, thanks to the project known as “Automatic1111”. ATk development link, https://bit.ly/SDandCNinsideGH, last accessed 2023/06/23.

1. Domingos P (2015) The master algorithm: how the quest for the ultimate learning machine will remake our world. First paperback, Basic Books, New York
2. Article link. https://bit.ly/SneakPeek-Expo2021-iMesh. Accessed 1 Feb 2023
3. Chaillou S (2019) The advent of architectural AI. Towards Data Science. Article link. https://towardsdatascience.com/the-advent-of-architectural-ai-706046960140. Accessed 2 Feb 2023
4. Chaillou S (2020) ArchiGAN: artificial intelligence x architecture. In: Architectural intelligence selected papers from the 1st international conference on computational design and robotic fabrication (CDRF 2019). Springer, Singapore, pp 117–127
5. Sevilla J, Heim L, Ho A, Besiroglu T, Hobbhahn M, Villalobos P (2022) Compute trends across three eras of machine learning. In: International joint conference on neural networks (IJCNN), pp 1–8. https://doi.org/10.1109/IJCNN55064.2022.9891914. Accessed 2 Feb 2023
6. A high-resolution version of the chart entitled “three eras of the ML”, elaborated by the author, is available here. https://bit.ly/LA-3ErasMLchrt. Accessed 7 Feb 2023
7. The State of AI Report analyses the most interesting developments in AI. The Report is produced by AI investors Nathan Benaich and Ian Hogarth. https://www.stateof.ai/. Accessed 2 Feb 2023
8. “Law of Accelerating Returns” article link. https://www.kurzweilai.net/the-law-of-accelerating-returns. Accessed 2 Feb 2023
9. Article link. https://openai.com/blog/introducing-openai/. Accessed 3 Feb 2023
10. Hy L (2022) Large-scale artificial intelligence models. Computer 55(05):76–80
11. AI-text classifier homepage. https://platform.openai.com/ai-text-classifier. Accessed 3 Feb 2023
12. Techcrunch article. https://tinyurl.com/NewYorkPublicSchool. Accessed 3 Feb 2023
13. Article link. https://tinyurl.com/MITtechReviewArticle. Accessed 4 Feb 2023
14. Article link. https://blog.google/technology/ai/bard-google-ai-search-updates/. Accessed 7 Feb 2023
15. Ho J, Jain A, Abbeel P (2020) Denoising diffusion probabilistic models. ArXiv abs/2006.11239. Accessed 3 Feb 2023
16. Nichol A, Dhariwal P, Ramesh A, Shyam P, Mishkin P, McGrew B, Sutskever I, Chen M (2021) GLIDE: towards photorealistic image generation and editing with text-guided diffusion models. arXiv preprint arXiv:2112.10741. Accessed 2 Feb 2023
17. Ramesh A, Dhariwal P, Nichol A, Chu C, Chen M (2022) Hierarchical text-conditional image generation with CLIP Latents. ArXiv abs/2204.06125. Accessed 2 Feb 2023
18. Inpainting and outpainting prototype tool homepage. https://github.com/lkwq007/stablediffusion-infinity. Accessed 7 Feb 2023
19. Rombach R, Blattmann A, Lorenz D, Esser P, Ommer B (2021) High-resolution image synthesis with latent diffusion models. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 10674–10685
20. Intel ISL work and models are freely available. https://github.com/intel-isl/MiDaS. Accessed 9 Feb 2023
21. Ranftl R, Lasinger K, Hafner D, Schindler K, Koltun V (2019) Towards robust monocular depth estimation: mixing datasets for zero-shot cross-dataset transfer. In: Proceedings of the IEEE transactions on pattern analysis and machine intelligence, vol 44, pp 1623–1637
22. Silberman N, Hoiem D, Kohli P, Fergus R (2012) Indoor segmentation and support inference from RGBD images. Lecture notes in computer science, vol 7576. Springer, Berlin, Heidelberg, pp 746–760
23. Stanford University Human-Centered AI report. https://aiindex.stanford.edu/report/. Accessed 10 Feb 2023
24. CSET full report. https://cset.georgetown.edu/publication/counting-ai-research/. Accessed 16 Feb 2023
25. Ambrosini L (2018) Data, digital & design—Produzione del progetto digitale e processi decisionali: la progettazione “flessibile” nell’Era dello Scripting e del Building Information Modelling come nuovo paradigma tecnologico. PhD thesis in Architecture XXXI cycle, DiARC, University of Naples “Federico II”. https://doi.org/10.13140/RG.2.2.27158.29769
26. Turchi T, Carta S, Ambrosini L, Malizia A (2023) Human-AI co-creation: evaluating the impact of large-scale text-to-image generative models on the creative process. In: Spano LD, Schmidt A, Santoro C, Stumpf S (eds) End-user development. IS-EUD 2023. Lecture notes in computer science, vol 13917. Springer, Cham. https://doi.org/10.1007/978-3-031-34433-6_3

Title: Ambrosinus-Toolkit Plugin: Artificial Intelligence Text-to-Image Generative Models Through Grasshopper
Author: Luciano Ambrosini
Publisher: Springer Nature Switzerland
Book: Coding Architecture
Print ISBN: 978-3-031-47912-0
Electronic ISBN: 978-3-031-47913-7
Copyright year: 2024
DOI: https://doi.org/10.1007/978-3-031-47913-7_1
