This is Part 5. There are also Part 1, Part 2, Part 3, Part 4, Part 6, Part 7 and Part 8.
This post continues listing the Text-to-Image scripts included with Visions of Chaos and some example outputs from each script.
Name: Multi-Perceptor CLIP Guided Diffusion Secondary Model Method
Author: SOMNAI
Original script: https://colab.research.google.com/drive/1Pf5F84FzWe9iAKNbiPaEo_v4hvQZ9SqS
Time for 512×512 on a 3090: 7 minutes 23 seconds
Maximum resolution on a 24 GB 3090: 1792×768 or 2048×640.
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: The winner for the longest name so far. Needs tweaking as the addition of the secondary model here reduces the usual excellent quality of the Multi-Perceptor CLIP Guided Diffusion. Still shows a lot of potential.
a 3D render of Robocop
a futuristic city IMAX
a matte painting of trypophobia
a renaissance painting of a cloudy sunset trending on ArtStation
a woman 4K photo
an evil clown Flickr
an oil painting of a nightmare creature by Louis Janmot
Indiana Jones
reflective spheres
zombies filmic
Name: Multi-Perceptor VQGAN+CLIP v2
Author: Remi Durant
Original script: https://colab.research.google.com/drive/1peZ98vBihDD9A1v7JdH5VvHDUuW5tcRK
Time for 512×512 on a 3090: 3 minutes 45 seconds
Maximum resolution on a 24 GB 3090: 1120×480.
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Version 2 of Remi’s Multi-Perceptor VQGAN+CLIP script.
a babbling brook by Zhou Wenjing
a bedroom by Francesco Furini
a computer by Édouard Detaille
a cross stitch of a landscape vivid colors
a kitchen filmic
a matte painting of halloween
a pastel of a peacock
a storybook illustration of a kitchen by Lena Alexander
an oil on canvas painting of a zombie made of voxels
vector art of Darth Vader
Name: 360Diffusion
Author: @sadly_existent
Original script: https://colab.research.google.com/github/sadnow/360Diffusion/blob/main/360Diffusion_Public.ipynb
Time for 512×512 on a 3090: 2 minutes 50 seconds
Maximum resolution on a 24 GB 3090: 1120×480.
Maximum resolution on an 8GB 2080: 256×256 2 minutes 28 seconds
Description: A new diffusion-based script, capable of some interesting results.
a bronze sculpture of a crying person by Auguste BaudBovy
a flemish baroque of a bouquet of flowers
a haunted house trending on ArtStation
a hyperrealistic painting of trypophobia by Xia Gui
a nightmare creature
a space nebula rendered in Cinema4D
a tentacle monster 4K HD realism
an oil on canvas painting of Danny Trejo by Pablo Rey
Frankenstein
heaven 8K 3D
Name: Multi-Perceptor VQGAN+CLIP v3
Author: Remi Durant
Original script: https://colab.research.google.com/drive/1peZ98vBihDD9A1v7JdH5VvHDUuW5tcRK
Time for 512×512 on a 3090: 3 minutes 38 seconds
Maximum resolution on a 24 GB 3090: 1120×480.
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Version 3 of Remi’s Multi-Perceptor VQGAN+CLIP script.
a bronze sculpture of Gandalf
a clown made of clay
a detailed painting of a desert oasis
a house by Kathleen Guthrie
a peacock made of metal
a tilt shift photo of the Las Vegas strip
a watercolor painting of reflective spheres 8K 3D
an art deco painting of an amusement park
lineart of Big Bird by Alesso Baldovinetti
vector art of a forest fire
Name: FuseDream
Author: Xingchao Liu et al
Original script: https://github.com/gnobitab/FuseDream
Time for 512×512 on a 3090: 3 minutes 38 seconds
Maximum resolution on a 24 GB 3090: Locked to 512×512.
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Gives some unique outputs compared to all the previous scripts.
a clown
a king
a matte painting of New York City by Robin Guthrie
a portrait of a young girl
a rough seascape
a sea monster
a teddy bear
a werewolf
an airbrush painting of an angry woman
an attractive woman
Name: Looking Glass
Author: bearsharktopus
Original script: https://colab.research.google.com/drive/11vdS9dpcZz2Q2efkOjcwyax4oob6N40G
Time for 256×256 on a 3090: 1 minute 19 seconds
Maximum resolution on a 24 GB 3090: Locked to 256×256.
Maximum resolution on an 8GB 2080: 256×256 2 minutes 03 seconds
Description: A variation on ruDALL-E that adds support for training the output with a single image or a directory of images. It does seem to create better results than the raw ruDALL-E scripts (starting from a single image of random Perlin noise).
a cemetery trending on pixiv
a colorful parrot
a photo of a house
a rough seascape
an alien city
an angry person by Eric Auld
an angry woman
an ugly woman
monkeys
Yoda
Name: Velocity Diffusion
Author: Katherine Crowson
Original script: https://github.com/crowsonkb/v-diffusion-pytorch
Time for 512×512 on a 3090: 3 minutes 57 seconds
Maximum resolution on a 24 GB 3090: 896×512 or 640×640.
Maximum resolution on an 8GB 2080: 128×128 1 minute 19 seconds
Description: The latest script from Katherine Crowson. Unique results compared to her previous diffusion based scripts. Worth experimenting with further.
a detailed matte painting of traffic
a detailed painting of Jason Vorhees
a Ghostbuster
a manga drawing of a lounge room by Yayoi Kusama
a mountain range CryEngine
a portrait of a young girl made of feathers rendered in unreal engine
a zombie
lineart of a Rubiks cube
The Grinch
vector art of Emporer Palpatine
Name: ruDALL-E Arbitrary Resolution v1
Author: @nev
Original script: https://colab.research.google.com/drive/1DbqOIUIVBPOrJ4MeaV4YkAlb7ilWQjKZ
Time for 512×512 on a 3090: 4 minutes 40 seconds
Maximum resolution on a 24 GB 3090: 1024×1024
Maximum resolution on an 8GB 2080: 768×768 16 minutes 34 seconds
Description: Allows larger resolution images using the ruDALL-E model. Very nice results and supports larger resolutions on GPUs with less VRAM.
a color pencil sketch of a werewolf
a colorful parrot
a gorilla
a painting of a cabin next to a stream in a secluded forest
a portrait of a girl with a dragon tattoo
a rose vivid colors
a sketch of an ugly man
a surrealist sculpture of a submarine
dense woodland
medusa
Name: ruDALL-E Arbitrary Resolution v2
Author: @nev
Original script: https://colab.research.google.com/drive/1DbqOIUIVBPOrJ4MeaV4YkAlb7ilWQjKZ
Time for 512×512 on a 3090: 4 minutes 40 seconds
Maximum resolution on a 24 GB 3090: 1024×1024
Maximum resolution on an 8GB 2080: 768×768 15 minutes 48 seconds
Description: v2 of the ruDALL-E Arbitrary Resolution script. Allows larger resolution images using the ruDALL-E model. Very nice results and supports larger resolutions on GPUs with less VRAM.
a bouquet of flowers
a cross stitch of a well kept garden
a futuristic city
a large waterfall
a minimalist painting of a castle in the mountains
a photocopy of a monkey vivid colors
a spooky forest by Laura Muntz Lyall
a teddy bear made of wrought iron
dense woodland
God
Name: GLIDE
Author: Unknown
Original script: https://colab.research.google.com/github/openai/glide-text2im/blob/main/notebooks/text2im.ipynb
Time for 256×256 on a 3090: 23 seconds
Maximum resolution on a 24 GB 3090: Locked to 256×256
Maximum resolution on an 8GB 2080: Locked to 256×256
Description: Images are rendered tiny at 64×64 and then upscaled internally within the script to 256×256 for output. The model has been “trimmed” so it cannot do anything human related and only does well for subjects it knows about. Hopefully they release the full model and/or train a larger resolution model in the future. Nothing to get excited about yet.
a cathedral
a color pencil sketch of a fire breathing dragon by Erwin Bowien
a gorilla
a library
a mosaic of monkeys
a painting of a cabin next to a stream in a secluded forest
an elephant
dinosaurs
goldfish
the Sydney Harbour Bridge lens flare
Name: Disco Diffusion
Author: @Somnai
Original script: https://colab.research.google.com/drive/1bItz4NdhAPHg5-u87KcH-MmJZjK-XqHN
Time for 512×512 on a 3090: 3 minutes 18 seconds
Maximum resolution on a 24 GB 3090: 2496×1088 11 minutes 50 seconds
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Diffusion script that includes all the latest features. Capable of rendering some very nice large resolution images (it may even do better at larger sized images than smaller resolutions like these samples).
a cute creature
a detailed matte painting of a morning landscape
a peacock made of mist by Reinier Nooms
a Pokemon character by William Etty
a polaroid photo of an angry woman
a rough seascape
a watercolor painting of a mountain path by Mark A Brennan rendered in Cinema4D
an attractive woman
computer rendering of a desert oasis rendered in unreal engine
the Amazon Rainforest by Qian Du
Name: Infinite Diffusion
Author: https://github.com/crowsonkb/v-diffusion-pytorch
Original script: https://colab.research.google.com/drive/1VJrfInU5RbciXXD_8jzY-FntFqiyj6au
Time for 512×512 on a 3090: 3 minutes 32 seconds
Maximum resolution on a 24 GB 3090: 512×512
Maximum resolution on an 8GB 2080: 256×256 3 minutes 15 seconds
Description: Diffusion-based script. Very VRAM hungry. Renders some unique images compared to the other methods.
cookie monster eating a cookie
a renaissance painting of a farm by Bernardo Strozzi
a silk screen of God
a storybook illustration of a cute monster trending on pixiv
a surrealist painting of Frankenstein
a watercolor painting of Yoda
a worried woman made of clay lens flare
an art deco painting of Luke Skywalker
an oil painting of Buzz Lightyear
Chewbacca
Name: minDALL-E
Author: Kakao Brain Corp
Original script: https://github.com/kakaobrain/minDALL-E
Time for 256×256 on a 3090: 1 minute 59 seconds
Maximum resolution on a 24 GB 3090: Locked to 256×256
Maximum resolution on an 8GB 2080: 256×256 1 minute 59 seconds
Description: Another DALL-E variation script. Locked to 256×256 but can generate multiple images each run.
a cozy den
a digital painting of Chewbacca by Willem van de Velde the Elder
a sad person
a skull
a storybook illustration of a happy clown by Gwen Barnard
a tree by Colin Gill
Bugs Bunny
fireworks by Károly Lotz
The Grand Canyon
Yoda
Name: ruDOLPH
Author: SBER AI
Original script: https://github.com/sberbank-ai/ru-dolph
Time for 128×128 on a 3090: 1 minute 15 seconds
Maximum resolution on a 24 GB 3090: Locked to 128×128
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Another ruDALL-E variation script. Locked to a tiny 128×128 resolution for now until they train the larger models. These examples were 4x upscaled with Real-ESRGAN.
a castle
a colorful parrot
a fine art painting of an ugly woman
a kitchen
a pastel of spirals made of plastic
a photorealistic painting of a cityscape
a portrait of a woman
a sad person by Ramon Casas i Carbó
kittens
vector art of a woman
Name: CLIP Guided Deep Image Prior
Author: Daniel Russell
Original script: https://colab.research.google.com/drive/1_oqIK8A67EgtJDdfsuJojc5ukNzirdle
Time for 512×512 on a 3090: 1 minute 45 seconds
Maximum resolution on a 24 GB 3090: 1024×1024 or 1680×720
Maximum resolution on an 8GB 2080: 512×512 (5 minutes 7 seconds) or 640×360
Description: Interesting script that has decent coherency. If only the output was slightly sharper and the colors slightly richer it would be a winner. Still good for unique outputs that the other methods cannot achieve.
a flemish baroque of a shrine
a statue of a tardigrade made of clay
a surrealist painting of a Pixar character
a surrealist painting of an evening landscape 4K photo
an abstract sculpture of an evil clown by Han Gan
an ambient occlusion render of Bugs Bunny made of wood
Cookie Monster
Jabba The Hutt by Shūbun Tenshō
tentacles by Johanna Marie Fosie
vector art of heaven
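For anyone with a lower VRAM GPU, the “Maximum resolution on an 8GB 2080” stats listed for each script above can be collected into a small lookup. This is only a rough Python sketch and is not part of Visions of Chaos. The names MAX_SIZE_8GB and runnable_on_8gb are made up for illustration, and the sizes are copied from the stats in this post (other 8GB GPUs may behave differently).

```python
# Largest size this post reports for an 8GB 2080, as (width, height).
# None means the post lists the script as unable to run on 8GB VRAM.
# MAX_SIZE_8GB and runnable_on_8gb are hypothetical names for illustration.
MAX_SIZE_8GB = {
    "Multi-Perceptor CLIP Guided Diffusion Secondary Model Method": None,
    "Multi-Perceptor VQGAN+CLIP v2": None,
    "360Diffusion": (256, 256),
    "Multi-Perceptor VQGAN+CLIP v3": None,
    "FuseDream": None,
    "Looking Glass": (256, 256),
    "Velocity Diffusion": (128, 128),
    "ruDALL-E Arbitrary Resolution v1": (768, 768),
    "ruDALL-E Arbitrary Resolution v2": (768, 768),
    "GLIDE": (256, 256),  # locked to 256x256
    "Disco Diffusion": None,
    "Infinite Diffusion": (256, 256),
    "minDALL-E": (256, 256),
    "ruDOLPH": None,
    "CLIP Guided Deep Image Prior": (512, 512),  # the post also lists 640x360
}


def runnable_on_8gb(min_side=0):
    """Names of scripts whose reported 8GB size is at least min_side per side,
    largest pixel count first (ties broken alphabetically)."""
    hits = [
        (name, size)
        for name, size in MAX_SIZE_8GB.items()
        if size is not None and min(size) >= min_side
    ]
    hits.sort(key=lambda item: (-item[1][0] * item[1][1], item[0]))
    return [name for name, _ in hits]


if __name__ == "__main__":
    # List the scripts reported to reach at least 512 pixels per side on 8GB.
    for name in runnable_on_8gb(min_side=512):
        width, height = MAX_SIZE_8GB[name]
        print(f"{name}: {width}x{height}")
```

Calling runnable_on_8gb() with no argument lists every script the post reports as working on the 8GB 2080, while a larger min_side narrows it down to the scripts that reach a more usable size.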
Any Others I Missed?
Do you know of any other colabs and/or github Text-to-Image systems I have missed? Let me know and I will see if I can convert them to work with Visions of Chaos for a future release. If you know of any public Discords with other colabs being shared let me know too.
Jason.
Attractive women seem to be the last barrier holding AI back and preventing the unavoidable takeover of the entire art industry 😀
Are there any AIs apart from ruDALL-E that somebody without a 3090 can play with?
Thanks for your explorations (and Visions of Chaos)
Depends what GPU you have and how much VRAM it has.
I have added a stat for “Maximum resolution on an 8GB 2080” on each script so you can get an idea of which ones work with a lower VRAM GPU.
Jason.
Hey, love the software. I figured I’d comment to mention that Disco has been updated to v4.1 and includes a bunch of new additions. I’ve been using the version in VoC, but it would be neat to see it updated.
As for GPUs, I’ve had no issues running the disco 3.0 version in VoC on an RTX 3070 which has 8GB of VRAM. Atm, I’ve been running 4 models at 512×512, no crashes, and I’ve run over 90 prompts.
Thank you.
Disco Diffusion 4.1 will be in the next release of Visions of Chaos (v90.8).
To add to this, I’ve been using the custom size with v4.1, and can do 1280×512 with 8GB VRAM on a 2080 Super. With a seed image, I need to drop it to 1152×512.
The website is an indispensable resource & the software is simply amazing. Very much looking forward to Disco diffusion being updated.