
2024 | OriginalPaper | Book chapter

1. Ambrosinus-Toolkit Plugin: Artificial Intelligence Text-to-Image Generative Models Through Grasshopper

Written by: Luciano Ambrosini

Published in: Coding Architecture

Publisher: Springer Nature Switzerland

Abstract

Large-scale text-to-image generative models (LTGMs) such as DALL-E and Stable Diffusion offer significant opportunities to improve the creation and manipulation of design communication through images. The toolkit developed by the author shows how to integrate the LTGM technology introduced by the OpenAI and StabilityAI platforms into the creative design process, together with Dense Prediction Transformer (DPT) technology, which can predict a 3D-shaped object starting from a 2D image generated by the AI. This contribution provides the key elements for understanding some fundamental aspects of research on artificial intelligence, while the “Ambrosinus-Toolkit” project, publicly shared online, gives the reader operational cues that can be tried out during the schematic design stage. The research presented marks a first step toward a new paradigm that uses Diffusion Models in the computational design workflow.
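The step from a 2D image to a 3D shape mentioned in the abstract can be illustrated with a generic back-projection of a depth map into a point cloud. This is a minimal sketch and not the toolkit's own implementation: the function name `depth_to_points`, the pinhole camera model, the `focal` parameter, and the choice of the image centre as principal point are all assumptions made for illustration, and the depth values are assumed to be positive distances already derived from a depth-estimation model.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(depth: List[List[float]], focal: float) -> List[Point]:
    """Back-project a depth map into 3D points (hypothetical helper).

    Assumes a pinhole camera whose principal point is the image centre
    and whose focal length (in pixels) is `focal`.
    """
    height = len(depth)
    width = len(depth[0])
    cx = (width - 1) / 2.0   # assumed principal point, x
    cy = (height - 1) / 2.0  # assumed principal point, y

    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Skip pixels without a valid depth estimate.
                continue
            x = (u - cx) * z / focal
            y = (v - cy) * z / focal
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    toy_depth = [[1.0, 1.0], [1.0, 0.0]]  # one invalid pixel
    for point in depth_to_points(toy_depth, focal=1.0):
        print(point)
```

In practice the resulting points would be passed to a geometry library for meshing; the sketch stops at the point list so that it stays free of external dependencies.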


Footnotes
1
Google Colab, or “Colaboratory”, is a browser-based programming platform that allows students, researchers and casual users to execute Python code and use GPU hardware remotely, either for free or by subscription. Google Colab Homepage, https://colab.research.google.com/, last accessed 2023/01/31.
 
2
Midjourney (MJ) works essentially as a Discord bot, that is, an automated module within the instant messaging platform designed for communication among gaming communities and internet users. MJ is best known for its artistic-style results.
 
3
The “node” logic is a user-interface simplification device already known from the Maya Hypergraph. In the early 2000s, the lack of agreement with Bentley Systems gave way to the development of Grasshopper (David Rutten, 2008), which departed sharply from the GenerativeComponents project (John Nastasi, 2003), a forerunner of VPL platforms.
 
4
A GAN (generative adversarial network) consists of two neural networks: a generator that produces new data and a discriminator that distinguishes real data from fake data. They are trained against each other: the generator tries to fool the discriminator, and the discriminator tries to catch the generator.
 
5
The other two modes are Supervised Learning (SL), which uses labeled data to train a model, and Unsupervised Learning (UL), which uses unlabeled data to find patterns and structure. There is also a further mode, Self-supervised Learning (SSL), a kind of UL that creates artificial labels from raw data.
 
6
Transformers, introduced in 2017, are a type of neural network for processing sequential data such as text. They use a self-attention mechanism to understand the meaning of words in context by weighing their importance in the input sequence.
 
7
Required Python libraries: atoolkitdpt, opencv-python, Pillow, numpy, matplotlib, transformers, torch, torchvision, imutils, open3d, openai, stability-sdk, timm. The documentation is available at https://bit.ly/LA-WYSIWYTfromDPTto3D and will be publicly shared through ATk v1.1.8/v1.1.9.
 
8
The Stable Diffusion engines included in ATk are: v1, v1.5, v2, v2.1, inpainting v1 and inpainting v2.
 
9
Sampler models included in ATk are: ddim, ddpm, k-dpm 2, k-dpm 2 ancestral, k-dpmpp 2m, k-dpmpp s2 ancestral, k-euler, k-euler ancestral, k-heun, k-lms.
 
10
Video demo, https://youtu.be/1VXe-R44nuw, last accessed 2023/02/13.
 
11
PyPI repository link, https://pypi.org/project/atoolkitdpt/, last accessed 2023/02/10.
 
12
Global–Local Path Networks for Monocular Depth Estimation with Vertical CutDepth, https://github.com/vinvino02/GLPDepth, last accessed 2023/02/10.
 
13
DPT model weights for semantic segmentation, https://github.com/isl-org/DPT, last accessed 2023/02/10.
 
14
The author provides a further example of image variation in this video demo, https://youtu.be/UPTc5o6rOLM, last accessed 2023/02/13.
 
15
The author provides a further example of image editing in this video demo, https://youtu.be/nmJTP1QvuIE, last accessed 2023/02/13.
 
16
Video demo, https://bit.ly/LA-DPTto3Dobj, last accessed 2023/02/21.
 
17
In the last few months Ambrosinus-Toolkit has received several additions, including the ability to run Stable Diffusion on a local machine and to use it as a rendering engine, thanks to the project called “Automatic1111”. ATk development link, https://bit.ly/SDandCNinsideGH, last accessed 2023/06/23.
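The self-attention mechanism described in footnote 6 can be sketched in a few lines. This is a simplified illustration and not code from the toolkit: the function names are hypothetical, and the learned query, key and value projection matrices of a real Transformer are replaced by the identity (an assumption made to keep the example short), so each token vector acts as its own query, key and value.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with identity projections.

    Assumption: queries, keys and values are the token vectors
    themselves; a trained model would first multiply them by
    learned weight matrices.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs: List[Vector] = []
    for query in tokens:
        # Similarity of this token to every token in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in tokens
        ]
        weights = softmax(scores)
        # Weighted average of all token vectors.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(dim)
        ]
        outputs.append(mixed)
    return outputs


if __name__ == "__main__":
    sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in self_attention(sequence):
        print([round(x, 3) for x in row])
```

Each output vector is a weighted mix of the whole input sequence, which is how a token's representation comes to depend on its context.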
 
Metadata
Title
Ambrosinus-Toolkit Plugin: Artificial Intelligence Text-to-Image Generative Models Through Grasshopper
Written by
Luciano Ambrosini
Copyright year
2024
DOI
https://doi.org/10.1007/978-3-031-47913-7_1