Text-to-Image Summary – Part 4

This is Part 4. There is also Part 1, Part 2, Part 3, Part 5 and Part 6.

This post continues listing the Text-to-Image scripts included with Visions of Chaos and some example outputs from each script.


Name: PixelDraw
Author: dribnet
Original script: https://colab.research.google.com/github/dribnet/clipit/blob/master/demos/PixelDrawer.ipynb
Time for 512×512 on a 3090: 1 minute 59 seconds
Maximum resolution on a 24 GB 3090: Huge. 4096×4096 and beyond.
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Generates “pixel art” images. I had a lot of requests to add support for this one.

'a cartoon of a peacock' PixelDraw Text-to-Image
a cartoon of a peacock

'a cloudy sunset' PixelDraw Text-to-Image
a cloudy sunset

'a gorilla' PixelDraw Text-to-Image
a gorilla

'a morning landscape' PixelDraw Text-to-Image
a morning landscape

'a watercolor painting of a castle' PixelDraw Text-to-Image
a watercolor painting of a castle

'an art deco painting of Al Pacino' PixelDraw Text-to-Image
an art deco painting of Al Pacino

'Hell' PixelDraw Text-to-Image
Hell

'Shrek' PixelDraw Text-to-Image
Shrek


Name: DirectVisions
Author: Jens Goldberg
Original script: https://colab.research.google.com/drive/127lKSsQjx-UDDUSvIkLL6mREfZ0KQu5D
Time for 512×512 on a 3090: 2 minutes 39 seconds
Maximum resolution on a 24 GB 3090: Huge. 4096×4096 and beyond.
Maximum resolution on an 8GB 2080: 4096×4096
Description: Interesting detailed images. Can create huge resolution results.

'a color pencil sketch of a western town' DirectVisions Text-to-Image
a color pencil sketch of a western town

'a detailed painting of a cephalopod' DirectVisions Text-to-Image
a detailed painting of a cephalopod

'a digital rendering of an ugly face' DirectVisions Text-to-Image
a digital rendering of an ugly face

'a pencil sketch of Buzz Lightyear' DirectVisions Text-to-Image
a pencil sketch of Buzz Lightyear

'a rough seascape by Pinchus Kremegne' DirectVisions Text-to-Image
a rough seascape by Pinchus Kremegne

'a stock photo of a president' DirectVisions Text-to-Image
a stock photo of a president

'a sunset' DirectVisions Text-to-Image
a sunset

'an alien city' DirectVisions Text-to-Image
an alien city

'an alien forest by Helen Berman' DirectVisions Text-to-Image
an alien forest by Helen Berman

'an evening landscape' DirectVisions Text-to-Image
an evening landscape


Name: Pixel Direct
Author: Unknown
Original script: https://colab.research.google.com/drive/1F9ZOZnpV3uBPRDSESaAXYwzNZJQRJT75
Time for 512×512 on a 3090: 1 minute 03 seconds
Maximum resolution on a 24 GB 3090: Huge. 4096×4096 and beyond.
Maximum resolution on an 8GB 2080: 2048×2048 1 minute 51 seconds
Description: Another “Pixel Art” script. More abstract results than the PixelDraw script above.

'a bronze sculpture of a nightmare creature' Pixel Direct Text-to-Image
a bronze sculpture of a nightmare creature

'a cartoon of Al Pacino' Pixel Direct Text-to-Image
a cartoon of Al Pacino

'a nightclub' Pixel Direct Text-to-Image
a nightclub

'a silk screen of a bouquet of flowers' Pixel Direct Text-to-Image
a silk screen of a bouquet of flowers

'an etching of a worried woman' Pixel Direct Text-to-Image
an etching of a worried woman

'an illustration of of a thunder storm' Pixel Direct Text-to-Image
an illustration of of a thunder storm


Name: FourierVisions
Author: Unknown
Original script: https://colab.research.google.com/drive/1nGNBjhbYnDHSumGPjpFHjDOsaZFAqGgF
Time for 512×512 on a 3090: 1 minute 40 seconds
Maximum resolution on a 24 GB 3090: Huge. 4096×4096 and beyond.
Maximum resolution on an 8GB 2080: 1024×1024 4 minutes 07 seconds
Description: Detailed images. The default script generates washed out pastel images, but with some gamma and brightness tweaks they can be improved (still not ideal, but better). Allows very large resolution images.
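The gamma and brightness tweak mentioned above can be sketched roughly as follows. This is only an illustration, not the code used in Visions of Chaos: the function names `adjust_pixel` and `adjust_image` and the default `gamma` and `brightness` values are assumptions made for the example.

```python
def adjust_pixel(value, gamma=1.5, brightness=1.0):
    """Apply gamma and brightness to one 8-bit channel value (0-255).

    With this formula a gamma above 1 darkens the mid-tones, which can
    counteract a washed out look. The defaults are illustrative assumptions,
    not tuned settings from the original script.
    """
    normalized = value / 255.0
    corrected = (normalized ** gamma) * brightness
    # Clamp to the valid 8-bit range before converting back to an integer.
    return max(0, min(255, round(corrected * 255.0)))


def adjust_image(pixels, gamma=1.5, brightness=1.0):
    """Apply adjust_pixel to every channel of every (r, g, b) pixel."""
    return [
        tuple(adjust_pixel(channel, gamma, brightness) for channel in pixel)
        for pixel in pixels
    ]
```

Pure black and pure white are left unchanged at a brightness of 1.0, so only the pale mid-tones are pulled down. A real pipeline would more likely apply the same formula to a whole image array at once.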

'a cathedral' FourierVisions Text-to-Image
a cathedral

'a charcoal drawing of zombies' FourierVisions Text-to-Image
a charcoal drawing of zombies

'a detailed painting of a sunset by Thomas Cantrell Dugdale' FourierVisions Text-to-Image
a detailed painting of a sunset by Thomas Cantrell Dugdale

'a ghost made of mist' FourierVisions Text-to-Image
a ghost made of mist

'a kitchen' FourierVisions Text-to-Image
a kitchen

'a movie monster' FourierVisions Text-to-Image
a movie monster

'a pencil sketch of a sad clown' FourierVisions Text-to-Image
a pencil sketch of a sad clown

'a werewolf' FourierVisions Text-to-Image
a werewolf

'an evil clown by Viktor Oliva' FourierVisions Text-to-Image
an evil clown by Viktor Oliva

'an ink drawing of an ugly monster' FourierVisions Text-to-Image
an ink drawing of an ugly monster


Name: PyramidVisions
Author: Unknown
Original script: https://colab.research.google.com/drive/1dpAS_wK34y7c6s-CatAFmBtbkjGT_erM
Time for 512×512 on a 3090: 3 minutes 08 seconds
Maximum resolution on a 24 GB 3090: Huge. 4096×4096 and beyond.
Maximum resolution on an 8GB 2080: 1024×1024 10 minutes 48 seconds
Description: Very detailed images. Not the fastest script, but gives some very nice results. Lower VRAM requirements so good for lesser spec GPUs. Definitely one of the better scripts worth exploring.

'a desert oasis' PyramidVisions Text-to-Image
a desert oasis

'a lush rainforest' PyramidVisions Text-to-Image
a lush rainforest

'a marble sculpture of an angry person' PyramidVisions Text-to-Image
a marble sculpture of an angry person

'a minimalist painting of the Amazon Rainforest' PyramidVisions Text-to-Image
a minimalist painting of the Amazon Rainforest

'a nightmare creature' PyramidVisions Text-to-Image
a nightmare creature

'a pastel of a computer made of paper' PyramidVisions Text-to-Image
a pastel of a computer made of paper

'an abstract sculpture of a sad clown' PyramidVisions Text-to-Image
an abstract sculpture of a sad clown

'an acrylic painting of an alien forest | vivid colors' PyramidVisions Text-to-Image
an acrylic painting of an alien forest | vivid colors

'Medusa' PyramidVisions Text-to-Image
Medusa

'vector art of an ugly woman' PyramidVisions Text-to-Image
vector art of an ugly woman


Name: Visions of AI v1
Author: Jason Rampe
Original script: Included with Visions of Chaos. No colab.
Time for 512×512 on a 3090: 1 minute 32 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480.
Maximum resolution on an 8GB 2080: 256×256 1 minute 33 seconds
Description: My first attempt at actually creating a Text-to-Image script. Based on the excellent example from Jonathan Whitaker's AIAIArt Lesson 3 tutorial. Gives some very nice fine detail in some areas, but suffers from the same lack of coherence as other scripts in that it creates multiple copies of the subject throughout the image. After actually trying to write my own script I only have more respect for those who can do this. Hopefully I can improve these results for a version 2. In the meantime, here are some samples from the current Visions of AI script.

'a cartoon of the human condition by Judy Takács' Visions of AI Text-to-Image
a cartoon of the human condition by Judy Takács

'a cubist painting of an evening landscape' Visions of AI Text-to-Image
a cubist painting of an evening landscape

'a digital rendering of frogs' Visions of AI Text-to-Image
a digital rendering of frogs

'a fire breathing dragon' Visions of AI Text-to-Image
a fire breathing dragon

'a hyperrealistic painting of a movie monster' Visions of AI Text-to-Image
a hyperrealistic painting of a movie monster

'a morning landscape' Visions of AI Text-to-Image
a morning landscape

'a shark' Visions of AI Text-to-Image
a shark

'a woodcut of an ugly man' Visions of AI Text-to-Image
a woodcut of an ugly man

'an airbrush painting of C-3PO' Visions of AI Text-to-Image
an airbrush painting of C-3PO

'Frankenstein' Visions of AI Text-to-Image
Frankenstein


Name: Visions of AI v2
Author: Jason Rampe
Original script: Included with Visions of Chaos. No colab.
Time for 512×512 on a 3090: 2 minutes 35 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480.
Maximum resolution on an 8GB 2080: 256×256 2 minutes 36 seconds
Description: An attempt to improve the coherency of the previous script. The first 30 iterations zoom into the image every 10 frames. This results in larger shapes/blobs for the rest of the script to work from. The idea is that it will give larger subjects compared to the v1 script. Kind of works. Gives blurrier results. To be fixed in the next version?
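The early zoom schedule described above can be sketched like this. It is a simplified illustration under stated assumptions, not the actual v2 script: the names `should_zoom` and `zoom_center`, the 1-based iteration count, the zoom factor of 2, and the nearest-neighbor resampling are all hypothetical choices made for the example.

```python
def should_zoom(iteration, zoom_every=10, zoom_until=30):
    """Return True on iterations 10, 20 and 30 (1-based counting assumed)."""
    return 0 < iteration <= zoom_until and iteration % zoom_every == 0


def zoom_center(image, factor=2):
    """Zoom into the center of a 2D grid using nearest-neighbor sampling.

    `image` is a list of equal-length rows. The output has the same size as
    the input but shows only the central 1/factor of it, enlarged. The factor
    of 2 is an assumption; the post does not state the real zoom amount.
    """
    height, width = len(image), len(image[0])
    crop_h, crop_w = height // factor, width // factor
    top, left = (height - crop_h) // 2, (width - crop_w) // 2
    return [
        [
            image[top + y * crop_h // height][left + x * crop_w // width]
            for x in range(width)
        ]
        for y in range(height)
    ]
```

In an optimization loop, the zoom would be applied to the working image whenever `should_zoom(iteration)` is true, so that small early features become larger blobs for the remaining iterations to refine. The enlargement itself is a plausible source of the blurrier results mentioned above.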

'a morning landscape by William Gear' Visions of AI v2 Text-to-Image
a morning landscape by William Gear

'a raytraced image of a nightclub lens flare' Visions of AI v2 Text-to-Image
a raytraced image of a nightclub lens flare

'a tentacle monster by Carlo Crivelli' Visions of AI v2 Text-to-Image
a tentacle monster by Carlo Crivelli

'a woodcut of a worried woman by Li Keran' Visions of AI v2 Text-to-Image
a woodcut of a worried woman by Li Keran

'an illustration of of a cave made of cheese' Visions of AI v2 Text-to-Image
an illustration of of a cave made of cheese

'Cthulhu' Visions of AI v2 Text-to-Image
Cthulhu

'cyberpunk art of a futuristic city' Visions of AI v2 Text-to-Image
cyberpunk art of a futuristic city

'goldfish' Visions of AI v2 Text-to-Image
goldfish

'reflective spheres' Visions of AI v2 Text-to-Image
reflective spheres

'the Australian outback' Visions of AI v2 Text-to-Image
the Australian outback


Name: Multi-Perceptor CLIP Guided Diffusion
Author: Varkarrus
Original script: https://colab.research.google.com/drive/1y3Vt39A5KSNFRa6Z2bCqDHxteZSVH9NC
Time for 512×512 on a 3090: 3 minutes 08 seconds
Maximum resolution on a 24 GB 3090: 896×512 or 1152×384 (dimensions must be divisible by 128).
Maximum resolution on an 8GB 2080: 128×128 1 minute 56 seconds
Description: Builds upon previous CLIP Guided Diffusion scripts. Like the previous script by Dango233 it uses three CLIP models simultaneously to “rate” the generated images, and I have added options to use up to six different CLIP models. The resulting image accuracy compared to the prompt, and the resulting image coherence seem to be much better than previous CLIP Guided Diffusion scripts that could almost have random outputs sometimes. This script is superb and highly recommended. Great lighting, textures and brushstrokes. Normally with these blog posts I do a batch run of random prompts overnight and then pick the best 10 images. In this case I had nearly 50 images in my “good” folder after going through the batch results. So, for this script I am showing 20 sample images.
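The idea of several CLIP models jointly "rating" an image can be sketched as a weighted average of per-model losses. This is a minimal illustration under stated assumptions, not the code from the original notebook: the embeddings are plain lists standing in for real CLIP outputs, the function names are hypothetical, and the loss used here (1 minus cosine similarity) may differ from the distance measure the actual script uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multi_perceptor_loss(embedding_pairs, weights=None):
    """Combine the losses from several perceptor models into one value.

    embedding_pairs: one (image_embedding, text_embedding) tuple per model.
    weights: optional per-model weights (hypothetical; equal by default).
    """
    if weights is None:
        weights = [1.0] * len(embedding_pairs)
    total = 0.0
    for (image_emb, text_emb), weight in zip(embedding_pairs, weights):
        # Lower loss means this model thinks the image matches the prompt.
        total += weight * (1.0 - cosine_similarity(image_emb, text_emb))
    return total / sum(weights)
```

Because the combined value only drops when most models agree that the image matches the prompt, a result that satisfies one model while looking wrong to the others is penalized. That may be part of why the outputs seem less random than those of single-model scripts.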

'a cute creature | TriX 400 TX' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
a cute creature | TriX 400 TX

'a digital painting of Frankenstein by Kanzan Shimomura' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
a digital painting of Frankenstein by Kanzan Shimomura

'a morning landscape by János SaxonSzász' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
a morning landscape by János SaxonSzász

'a nightmare creature' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
a nightmare creature

'a photorealistic painting of a teddy bear' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
a photorealistic painting of a teddy bear

'a portrait of a young girl' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
a portrait of a young girl

'a space nebula | IMAX' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
a space nebula | IMAX

'a worried man' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
a worried man

'a zombie by Nathaniel Hone' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
a zombie by Nathaniel Hone

'an acrylic painting of a spider by Abram Arkhipov' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
an acrylic painting of a spider by Abram Arkhipov

'an airbrush painting of a monkey by Jeremy Henderson' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
an airbrush painting of a monkey by Jeremy Henderson

'an alien landscape' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
an alien landscape

'an ugly creature made of insects' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
an ugly creature made of insects

'an ultrafine detailed painting of a sad person | ZBrush' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
an ultrafine detailed painting of a sad person | ZBrush

'Arnold Schwarzenegger | trending on ArtStation' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
Arnold Schwarzenegger | trending on ArtStation

'concept art of Robocop' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
concept art of Robocop

'dinosaurs' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
dinosaurs

'Dracula | CGSociety' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
Dracula | CGSociety

'flesh made of insects' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
flesh made of insects

'God by William Simpson' Multi-Perceptor CLIP Guided Diffusion Text-to-Image
God by William Simpson


Name: Pixel MultiColors
Author: Remi Durant
Original script: https://colab.research.google.com/drive/17c-13cl_VQKpHq2rDrnFVi6ZT-CHeZNn
Time for 512×512 on a 3090: 0 minutes 44 seconds
Maximum resolution on a 24 GB 3090: 4096×4096.
Maximum resolution on an 8GB 2080: 2048×2048 7 minutes 45 seconds
Description: Very noisy/pixelated/abstract results. The default script gives dark images, which some tweaks to brightness and contrast can help with. Maybe a little bit of blur could help too in a future revision. It is fast though, and can support huge image sizes.

'a charcoal drawing of a cute creature made of metal' Pixel MultiColors Text-to-Image
a charcoal drawing of a cute creature made of metal

'a farm' Pixel MultiColors Text-to-Image
a farm

'a forest path by Walter Leighton Clark' Pixel MultiColors Text-to-Image
a forest path by Walter Leighton Clark

'a lighthouse' Pixel MultiColors Text-to-Image
a lighthouse

'a surrealist painting of a beachside resort' Pixel MultiColors Text-to-Image
a surrealist painting of a beachside resort

'a well kept garden' Pixel MultiColors Text-to-Image
a well kept garden

'an abstract sculpture of Pikachu' Pixel MultiColors Text-to-Image
an abstract sculpture of Pikachu

'an art deco painting of a volcano' Pixel MultiColors Text-to-Image
an art deco painting of a volcano

'an ink drawing of tentacles' Pixel MultiColors Text-to-Image
an ink drawing of tentacles

'an octopus Rendered in Cinema4D' Pixel MultiColors Text-to-Image
an octopus Rendered in Cinema4D


Name: Ultraquick CLIP Guided Diffusion
Author: @sadly_existent
Original script: https://colab.research.google.com/github/sadnow/360Diffusion/blob/main/360Diffusion_AlphaTesting.ipynb
Time for 512×512 on a 3090: 1 minute 57 seconds
Maximum resolution on a 24 GB 3090: Locked to either 256×256 or 512×512.
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Another CLIP Guided Diffusion script. Can give some interesting results.

'a cave' Ultraquick CLIP Guided Diffusion Text-to-Image
a cave

'a color pencil sketch of Cthulhu' Ultraquick CLIP Guided Diffusion Text-to-Image
a color pencil sketch of Cthulhu

'a detailed painting of Shrek' Ultraquick CLIP Guided Diffusion Text-to-Image
a detailed painting of Shrek

'a flemish baroque of the human condition by George Barret Jr' Ultraquick CLIP Guided Diffusion Text-to-Image
a flemish baroque of the human condition by George Barret Jr

'a low poly render of halloween' Ultraquick CLIP Guided Diffusion Text-to-Image
a low poly render of halloween

'a photorealistic painting of a worried woman made of paper by Ann Thetis Blacker' Ultraquick CLIP Guided Diffusion Text-to-Image
a photorealistic painting of a worried woman made of paper by Ann Thetis Blacker

'a surrealist painting of a worried man' Ultraquick CLIP Guided Diffusion Text-to-Image
a surrealist painting of a worried man

'a surrealist sculpture of an angry man 8K 3D' Ultraquick CLIP Guided Diffusion Text-to-Image
a surrealist sculpture of an angry man 8K 3D

'Robocop' Ultraquick CLIP Guided Diffusion Text-to-Image
Robocop

'zombies' Ultraquick CLIP Guided Diffusion Text-to-Image
zombies


Name: ruDALL-E
Author: @sadly_existent
Original script: https://colab.research.google.com/drive/1wGE-046et27oHvNlBNPH07qrEQNE04PQ
Optimized script: https://colab.research.google.com/drive/1euIMG8E6kSFA2nU58LqrVsq6nbXjqELY
Time for 256×256 on a 3090: 1 minute 05 seconds
Maximum resolution on a 24 GB 3090: Locked to 256×256.
Maximum resolution on an 8GB 2080: Cannot run on 8GB VRAM
Description: Russian version of DALL-E. Only takes text prompts in Russian, so I do some auto English to Russian translations. Locked to small 256×256 images at this stage, but can create some interesting results.

'a hyperrealistic painting of Chewbacca by Edith Grace Wheatley' ruDALL-E Text-to-Image
a hyperrealistic painting of Chewbacca by Edith Grace Wheatley

'a low poly render of Pikachu' ruDALL-E Text-to-Image
a low poly render of Pikachu

'a man' ruDALL-E Text-to-Image
a man

'a rose' ruDALL-E Text-to-Image
a rose

'a stock photo of puppies' ruDALL-E Text-to-Image
a stock photo of puppies

'egyptian art of a portrait of a woman' ruDALL-E Text-to-Image
egyptian art of a portrait of a woman

'Harry Potter' ruDALL-E Text-to-Image
Harry Potter

'Indiana Jones' ruDALL-E Text-to-Image
Indiana Jones

'Robocop made of gold' ruDALL-E Text-to-Image
Robocop made of gold

'Yoda' ruDALL-E Text-to-Image
Yoda


Name: ruVQGAN+CLIP
Author: nev
Original script: https://colab.research.google.com/drive/1wAnIHocDYFAbWtA7rk8C7cFEUdRyLzwZ
Time for 512×512 on a 3090: 1 minute 28 seconds
Maximum resolution on a 24 GB 3090: 1120×480.
Maximum resolution on an 8GB 2080: 256×256 1 minute 27 seconds
Description: Creates fairly blurry results, even with post-process sharpening. If anyone could get these results crisper it would really improve the output.

'a 3D render of a wizard by Gertrude Greene' ruVQGAN+CLIP Text-to-Image
a 3D render of a wizard by Gertrude Greene

'a cubist painting of a Pokemon character' ruVQGAN+CLIP Text-to-Image
a cubist painting of a Pokemon character

'a cute creature' ruVQGAN+CLIP Text-to-Image
a cute creature

'a matte painting of halloween by Carlos Trillo Name' ruVQGAN+CLIP Text-to-Image
a matte painting of halloween by Carlos Trillo Name

'a photorealistic painting of an alien landscape by Jacob Ochtervelt' ruVQGAN+CLIP Text-to-Image
a photorealistic painting of an alien landscape by Jacob Ochtervelt

'a rough seascape filmic' ruVQGAN+CLIP Text-to-Image
a rough seascape filmic

'a sea monster' ruVQGAN+CLIP Text-to-Image
a sea monster

'a woodcut of a skull by Gu Hongzhong trending on ArtStation' ruVQGAN+CLIP Text-to-Image
a woodcut of a skull by Gu Hongzhong trending on ArtStation

'Cthulhu' ruVQGAN+CLIP Text-to-Image
Cthulhu

'trypophobia' ruVQGAN+CLIP Text-to-Image
trypophobia


Name: Multi-Perceptor VQGAN+CLIP
Author: Remi Durant
Original script: https://colab.research.google.com/drive/1peZ98vBihDD9A1v7JdH5VvHDUuW5tcRK
Time for 512×512 on a 3090: 2 minutes 30 seconds
Maximum resolution on a 24 GB 3090: 1120×480.
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: As with the previous Multi-Perceptor CLIP Guided Diffusion script, this one allows two different CLIP models to be used to rate the VQGAN output images. VQGAN is not going to beat diffusion for image coherence, but this script can give some very nice lighting and fine details in images.

'a bronze sculpture of an evil clown made of clay by Dionisio Baixeras Verdaguer' Multi-Perceptor VQGAN+CLIP Text-to-Image
a bronze sculpture of an evil clown made of clay by Dionisio Baixeras Verdaguer

'a fantasy land by Shigeru Aoki' Multi-Perceptor VQGAN+CLIP Text-to-Image
a fantasy land by Shigeru Aoki

'a hyperrealistic painting of puppies' Multi-Perceptor VQGAN+CLIP Text-to-Image
a hyperrealistic painting of puppies

'a midnineteenth century engraving of the Sydney Opera House' Multi-Perceptor VQGAN+CLIP Text-to-Image
a midnineteenth century engraving of the Sydney Opera House

'a statue of reflective spheres' Multi-Perceptor VQGAN+CLIP Text-to-Image
a statue of reflective spheres

'a surrealist painting of a tropical beach' Multi-Perceptor VQGAN+CLIP Text-to-Image
a surrealist painting of a tropical beach

'an alien city CGSociety' Multi-Perceptor VQGAN+CLIP Text-to-Image
an alien city CGSociety

'an oil painting of a fire breathing dragon' Multi-Perceptor VQGAN+CLIP Text-to-Image
an oil painting of a fire breathing dragon

'computer rendering of a well kept garden by Norman Garstin ZBrush' Multi-Perceptor VQGAN+CLIP Text-to-Image
computer rendering of a well kept garden by Norman Garstin ZBrush

'war CryEngine' Multi-Perceptor VQGAN+CLIP Text-to-Image
war CryEngine


Name: Hypertron
Author: Philipuss
Original script: https://colab.research.google.com/drive/10fa8X6EsfZfda1dfhJ_BtfPZ7Te1WGoX
Time for 512×512 on a 3090: 2 minutes 00 seconds
Maximum resolution on a 24 GB 3090: 1120×480.
Maximum resolution on an 8GB 2080: 256×256 1 minute 35 seconds
Description: Another VQGAN based script. Has various “flavors” to give different results. Works OK. Can give the “image in a sea of purple/grey” that previous MSE based scripts suffered from. Still worth a try.

'a black and white photo of a fireman' Hypertron Text-to-Image
a black and white photo of a fireman

'a cute monster by Józef Mehoffer' Hypertron Text-to-Image
a cute monster by Józef Mehoffer

'a matte painting of a forest clearing' Hypertron Text-to-Image
a matte painting of a forest clearing

'a pop art painting of a human' Hypertron Text-to-Image
a pop art painting of a human

'a renaissance painting of a ghost by Jan van de Cappelle film' Hypertron Text-to-Image
a renaissance painting of a ghost by Jan van de Cappelle film

'a sea monster made of metal' Hypertron Text-to-Image
a sea monster made of metal

'a tattoo of a zombie' Hypertron Text-to-Image
a tattoo of a zombie

'a watercolor painting of a dragon Flickr' Hypertron Text-to-Image
a watercolor painting of a dragon Flickr

'an art deco painting of a haunted house by Mary Cameron' Hypertron Text-to-Image
an art deco painting of a haunted house by Mary Cameron

'concept art of a mountainscape by Maximilian Cercha' Hypertron Text-to-Image
concept art of a mountainscape by Maximilian Cercha


Name: CLIP Guided Diffusion Secondary Model Method
Author: Katherine Crowson
Original script: https://colab.research.google.com/drive/1mpkrhOjoyzPeSWy2r7T8EYRaU7amYOOi
Time for 512×512 on a 3090: 2 minutes 28 seconds
Maximum resolution on a 24 GB 3090: 1792×768 or 2048×640.
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: A new diffusion based script from Katherine Crowson including a new “secondary model” she trained. Capable of some unique results with good textures and lighting.

'a detailed painting of Fozzy Bear by LeConte Stewart' CLIP Guided Diffusion Secondary Model Method Text-to-Image
a detailed painting of Fozzy Bear by LeConte Stewart

'a flemish baroque of a happy person trending on pixiv' CLIP Guided Diffusion Secondary Model Method Text-to-Image
a flemish baroque of a happy person trending on pixiv

'a flock of birds' CLIP Guided Diffusion Secondary Model Method Text-to-Image
a flock of birds

'a Ghostbuster CGSociety' CLIP Guided Diffusion Secondary Model Method Text-to-Image
a Ghostbuster CGSociety

'a kitchen made of cheese' CLIP Guided Diffusion Secondary Model Method Text-to-Image
a kitchen made of cheese

'a nightmare creature' CLIP Guided Diffusion Secondary Model Method Text-to-Image
a nightmare creature

'a photorealistic painting of The Grinch' CLIP Guided Diffusion Secondary Model Method Text-to-Image
a photorealistic painting of The Grinch

'a portrait of a woman' CLIP Guided Diffusion Secondary Model Method Text-to-Image
a portrait of a woman

'an art deco painting of a sad clown' CLIP Guided Diffusion Secondary Model Method Text-to-Image
an art deco painting of a sad clown

'an oil painting of a nightmare' CLIP Guided Diffusion Secondary Model Method Text-to-Image
an oil painting of a nightmare



Any Others I Missed?

Do you know of any other Colab and/or GitHub Text-to-Image systems I have missed? Let me know and I will see if I can convert them to work with Visions of Chaos for a future release. If you know of any public Discords with other Colabs being shared, let me know too.

Jason.

Text-to-Image Summary – Part 3

This is Part 3. There is also Part 1, Part 2, Part 4, Part 5 and Part 6.

This post continues listing the Text-to-Image scripts included with Visions of Chaos and some example outputs from each script.


Name: CLIP Guided Diffusion v4
Author: Katherine Crowson
Original script: https://colab.research.google.com/drive/1V66mUeJbXrTuQITvJunvnWVn96FEbSI3
Time for 512×512 on a 3090: 3 minutes 05 seconds
Maximum resolution on a 24 GB 3090: Locked to 512×512
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Another CLIP Guided Diffusion script. Locked to 512×512 resolution. Like the other CLIP Diffusion scripts, some of the results can be very detailed and interesting, but a lot of the time it is hit and miss to get a result that reliably matches the input phrase. When it gets a “hit” it can create very detailed impressive results, but the number of “misses” stops it from getting a great rating. Still worth a try if you have the patience to run a large batch of images waiting for the best results. The following samples were hand picked from a large batch run of random prompt phrases.

'a forest clearing' CLIP Guided Diffusion v4 Text-to-Image
a forest clearing

'a storybook illustration of a nightmare' CLIP Guided Diffusion v4 Text-to-Image
a storybook illustration of a nightmare

'an impressionist painting of a cemetery' CLIP Guided Diffusion v4 Text-to-Image
an impressionist painting of a cemetery

'Harry Potter in the style of Rembrandt' CLIP Guided Diffusion v4 Text-to-Image
Harry Potter in the style of Rembrandt

'a detailed painting of a witch' CLIP Guided Diffusion v4 Text-to-Image
a detailed painting of a witch

'a babbling brook' CLIP Guided Diffusion v4 Text-to-Image
a babbling brook

'a desert oasis' CLIP Guided Diffusion v4 Text-to-Image
a desert oasis

'a hyperrealistic painting of an android' CLIP Guided Diffusion v4 Text-to-Image
a hyperrealistic painting of an android

'eyeballs' CLIP Guided Diffusion v4 Text-to-Image
eyeballs

'a cross stitch of Buzz Lightyear' CLIP Guided Diffusion v4 Text-to-Image
a cross stitch of Buzz Lightyear


Name: CLIP Guided Decision Transformer
Author: Katherine Crowson
Original script: https://colab.research.google.com/drive/1V66mUeJbXrTuQITvJunvnWVn96FEbSI3
Time for 512×512 on a 3090: 1 minute 13 seconds
Maximum resolution on a 24 GB 3090: Locked to 384×384
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Another one from Katherine Crowson. Some of the results can be very detailed and interesting, but a lot of the time it is hit and miss to get a result that reliably matches the input phrase. When it gets a “hit” it can create very detailed impressive results, but the number of “misses” stops it from getting a great rating. The following samples were hand picked from a large batch run of random prompt phrases.
Another good point for CLIP Guided Decision Transformer is that it will generate a batch of images from each run. So rather than a single image for the prompt text you can specify (for example) 8 images to be generated from the prompt. This allows a much larger set of images to be quickly generated in which to find those great outputs.
For these images I have enhanced the resolution 4x using Real-ESRGAN (the thumbnails are the original output images and the clicked images are resized 4x).

a detailed painting of a palace by Thomas Kinkade
a detailed painting of a palace by Thomas Kinkade

a drawing of Chewbacca
a drawing of Chewbacca

a forest path
a forest path

a renaissance painting of a mountain range
a renaissance painting of a mountain range

a rough seascape
a rough seascape

a rough seascape
a rough seascape

a spooky forest
a spooky forest

an oil on canvas painting of a western town
an oil on canvas painting of a western town

Frankenstein
Frankenstein

The Grand Canyon
The Grand Canyon


Name: CLIPIT
Author: dribnet
Original script: https://github.com/dribnet/clipit
Time for 512×512 on a 3090: 2 minutes 38 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Another GAN+CLIP script. Gives nice results that tend to match the prompt text more closely. This one is heavy on VRAM usage.

'a happy family by Piet Mondiran' CLIPIT
a happy family by Piet Mondiran

'a landscape' CLIPIT
a landscape

'a peacock' CLIPIT
a peacock

'a tropical beach by Thomas Kinkade' CLIPIT
a tropical beach by Thomas Kinkade

'a woodcut of Dracula' CLIPIT
a woodcut of Dracula

'an ambient occlusion render of a zombie' CLIPIT
an ambient occlusion render of a zombie

'eyeballs in the style of Claude Monet' CLIPIT
eyeballs in the style of Claude Monet


Name: Art Machine
Author: Hillel Wayne
Original script: https://colab.research.google.com/drive/1n_xrgKDlGQcCF6O-eL3NOd_x4NSqAUjK
Time for 512×512 on a 3090: 4 minutes 04 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480
Maximum resolution on an 8GB 2080: 256×256 1 minute 50 seconds
Description: Another VQGAN+CLIP script.

'a charcoal drawing of a kitchen' Art Machine
a charcoal drawing of a kitchen

'a mosaic of a mountain path | CryEngine' Art Machine
a mosaic of a mountain path | CryEngine

'a silk screen of a tropical beach in the style of Kandinsky' Art Machine
a silk screen of a tropical beach in the style of Kandinsky

'a woodcut of a nightmare creature' Art Machine
a woodcut of a nightmare creature

'an illustration of of a mountainscape' Art Machine
an illustration of of a mountainscape

'an ultrafine detailed painting of a green tree frog as created by Craig Mullins' Art Machine
an ultrafine detailed painting of a green tree frog as created by Craig Mullins

'Dracula' Art Machine
Dracula

'Planets' Art Machine
Planets


Name: VQGAN+CLIP v5
Author: Max Woolf
Original script: https://colab.research.google.com/drive/1wkF67ThUz37T2_oPIuSwuO4e_-0vjaLs
Time for 512×512 on a 3090: 2 minutes 13 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480
Maximum resolution on an 8GB 2080: 256×256 2 minutes 02 seconds
Description: Another VQGAN+CLIP script. More abstract results from this one.

'a desert oasis in the style of Salvador Dali' VQGAN+CLIP v5
a desert oasis in the style of Salvador Dali

'a hyperrealistic painting of a dragon' VQGAN+CLIP v5
a hyperrealistic painting of a dragon

'Big Bird' VQGAN+CLIP v5
Big Bird

'Cthulhu' VQGAN+CLIP v5
Cthulhu

'Robert DeNiro' VQGAN+CLIP v5
Robert DeNiro

'Yoda' VQGAN+CLIP v5
Yoda “hmmm, abstract I am”


Name: Zoetrope 5.5
Author: Bearsharktopusdev
Original script: https://colab.research.google.com/drive/1LpEbICv1mmta7Qqic1IcRTsRsq7UKRHM
Time for 512×512 on a 3090: 3 minutes 27 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×720
Maximum resolution on an 8GB 2080: 256×256 3 minutes 23 seconds
Description: Updated version of Zoetrope 5. Supports more VQGAN models, CLIP models and optimizers compared to Zoetrope 5.

'a cephalopod' Zoetrope 5.5 Text-to-Image
a cephalopod

'a flemish baroque of a demon' Zoetrope 5.5 Text-to-Image
a flemish baroque of a demon

'a photo of a submarine in the style of Vincent van Gogh' Zoetrope 5.5 Text-to-Image
a photo of a submarine in the style of Vincent van Gogh

'a snail' Zoetrope 5.5 Text-to-Image
a snail

'Cthulhu' Zoetrope 5.5 Text-to-Image
Cthulhu

'flesh' Zoetrope 5.5 Text-to-Image
flesh


Name: Zeta Quantize
Author: afiaka87
Original script: https://colab.research.google.com/gist/afiaka87/a97cca3b54c02209b94ff805224f9eb5/zeta_quantize.ipynb
Time for 512×512 on a 3090: 4 minutes 18 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×720
Maximum resolution on an 8GB 2080: 256×256 5 minutes 01 seconds
Description: Another VQGAN+CLIP script.

'a cute creature made of silver' Zeta Quantize
a cute creature made of silver

'a detailed painting of a cephalopod' Zeta Quantize
a detailed painting of a cephalopod

'a detailed painting of a ghost' Zeta Quantize
a detailed painting of a ghost

'a forest fire made of copper' Zeta Quantize
a forest fire made of copper

'a peacock' Zeta Quantize
a peacock

'a sketch of a Pokemon character in the style of Odilon Redon' Zeta Quantize
a sketch of a Pokemon character in the style of Odilon Redon

'a watercolor painting of dense woodland' Zeta Quantize
a watercolor painting of dense woodland


Name: Experimental VQGAN
Author: Various
Original script: https://colab.research.google.com/drive/1jx3klUxlGbYUwvtqzC9SYl4XZKHL3R81
Time for 512×512 on a 3090: 1 minute 12 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×720
Maximum resolution on an 8GB 2080: 256×256 0 minutes 52 seconds
Description: Very nice smooth results from this one.

'a desert oasis in the style of Craig Mullins' Experimental VQGAN
a desert oasis in the style of Craig Mullins

'a dragon' Experimental VQGAN
a dragon

'a manga drawing of a happy alien' Experimental VQGAN
a manga drawing of a happy alien

'a nightmare' Experimental VQGAN
a nightmare

'a surrealist painting of love' Experimental VQGAN
a surrealist painting of love

'a watercolor painting of a lighthouse' Experimental VQGAN
a watercolor painting of a lighthouse

'an airbrush painting of a well kept garden by Piet Mondiran' Experimental VQGAN
an airbrush painting of a well kept garden by Piet Mondiran

'Cookie Monster' Experimental VQGAN
Cookie Monster


Name: SlideShowVisions
Author: Active Galaxy
Original script: https://colab.research.google.com/drive/1IihC4ZJvCh_tOgBVd900BzHX-ulPEFsa
Time for 512×512 on a 3090: 2 minutes 25 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×720
Maximum resolution on an 8GB 2080: 128×128 1 minute 56 seconds
Description: Tends to give more abstract paper cutout looks.

'a happy child' SlideShowVisions
a happy child

'a house vivid colors' SlideShowVisions
a house vivid colors

'a sea monster' SlideShowVisions
a sea monster

'a thunder storm' SlideShowVisions
a thunder storm

'a tree' SlideShowVisions
a tree

'a woodcut of war' SlideShowVisions
a woodcut of war

'an engraving of zombies' SlideShowVisions
an engraving of zombies

'Han Solo' SlideShowVisions
Han Solo


Name: Quick CLIP Guided Diffusion
Author: Daniel Russell
Original script: https://colab.research.google.com/drive/1FuOobQOmDJuG7rGsMWfQa883A9r4HxEO
Time for 512×512 on a 3090: 43 seconds
Maximum resolution on a 24 GB 3090: 512×512
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Modified version of CLIP Guided Diffusion that gets results quicker. Option for 256×256 or 512×512 sized images. Still very hit and miss when getting images that resemble the input prompt. The following samples came from a large overnight batch run of random prompts.

'a cathedral' Quick CLIP Guided Diffusion
a cathedral

'a digital painting of a space nebula' Quick CLIP Guided Diffusion
a digital painting of a space nebula

'a lounge room' Quick CLIP Guided Diffusion
a lounge room

'a monkey | lens flare' Quick CLIP Guided Diffusion
a monkey | lens flare

'a nightmare creature' Quick CLIP Guided Diffusion
a nightmare creature

'a rough seascape' Quick CLIP Guided Diffusion
a rough seascape

'a landscape' Quick CLIP Guided Diffusion
a landscape

'an android' Quick CLIP Guided Diffusion
an android

'an attractive woman' Quick CLIP Guided Diffusion
an attractive woman

'an oil on canvas painting of a cloudy sunset' Quick CLIP Guided Diffusion
an oil on canvas painting of a cloudy sunset


Name: CLIP Guided Diffusion v5
Author: Katherine Crowson
Original script: https://colab.research.google.com/drive/1QBsaDAZv8np29FPbvjffbE1eytoJcsgA
Time for 512×512 on a 3090: 3 minutes 48 seconds
Maximum resolution on a 24 GB 3090: Locked to 512×512
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Another CLIP Guided Diffusion script. Locked to 512×512 resolution. Needs less VRAM than the previous versions. The following samples were hand picked from a large batch run of random prompt phrases.

'a cityscape' CLIP Guided Diffusion v5 Text-to-Image
a cityscape

'a gorilla' CLIP Guided Diffusion v5 Text-to-Image
a gorilla

'Cthulhu by Craig Mullins' CLIP Guided Diffusion v5 Text-to-Image
Cthulhu by Craig Mullins

'computer rendering of Emporer Palpatine made of cheese by Evan Charlton' CLIP Guided Diffusion v5 Text-to-Image
computer rendering of Emporer Palpatine made of cheese by Evan Charlton

'digital art of a mountainscape as created by Persis Goodale Thurston Taylor' CLIP Guided Diffusion v5 Text-to-Image
digital art of a mountainscape as created by Persis Goodale Thurston Taylor

'a digital rendering of Chewbacca' CLIP Guided Diffusion v5 Text-to-Image
a digital rendering of Chewbacca

'an ugly person' CLIP Guided Diffusion v5 Text-to-Image
an ugly person

See this tweet for an example of using CLIP Guided Diffusion to stylize a portrait.


Name: MSE Regulized Modified
Author: jbusted
Original script: https://colab.research.google.com/drive/1gFn9u3oPOgsNzJWEFmdK-N9h_y65b8fj
Time for 512×512 on a 3090: 3 minutes 02 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×720
Maximum resolution on an 8GB 2080: 256×256 2 minutes 45 seconds
Description: Modified and updated version of the previous “MSE Regulized VQGAN+CLIP” script. Less likely to suffer the previous script’s issue of subjects floating in a purple void.

'a bronze sculpture of a planet' MSE Regulized Modified Text-to-Image
a bronze sculpture of a planet

'a cave by Asher Brown Durand' MSE Regulized Modified Text-to-Image
a cave by Asher Brown Durand

'a charcoal drawing of Emporer Palpatine' MSE Regulized Modified Text-to-Image
a charcoal drawing of Emporer Palpatine

'a cozy den' MSE Regulized Modified Text-to-Image
a cozy den

'a detailed drawing of a heart made of string by William MacTaggart' MSE Regulized Modified Text-to-Image
a detailed drawing of a heart made of string by William MacTaggart

'a digital rendering of Arnold Schwarzenegger made of metal by Muriel Brandt' MSE Regulized Modified Text-to-Image
a digital rendering of Arnold Schwarzenegger made of metal by Muriel Brandt

'a lounge room' MSE Regulized Modified Text-to-Image
a lounge room

'a palace by Jules Joseph Lefebvre' MSE Regulized Modified Text-to-Image
a palace by Jules Joseph Lefebvre

'an oil on canvas painting of a lush rainforest' MSE Regulized Modified Text-to-Image
an oil on canvas painting of a lush rainforest

'an oil on canvas painting of Cookie Monster' MSE Regulized Modified Text-to-Image
an oil on canvas painting of Cookie Monster


Name: Pixray
Author: dribnet
Original script: https://colab.research.google.com/github/dribnet/clipit/blob/master/demos/Start_Here.ipynb
Time for 512×512 on a 3090: 1 minute 44 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×720
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Updated version of the previous “CLIPIT” script.

'a bronze sculpture of a nightmare creature' Pixray Text-to-Image
a bronze sculpture of a nightmare creature

'a fire breathing dragon by Jan Baptist Weenix' Pixray Text-to-Image
a fire breathing dragon by Jan Baptist Weenix

'a morning landscape' Pixray Text-to-Image
a morning landscape

'a surrealist sculpture of an elephant' Pixray Text-to-Image
a surrealist sculpture of an elephant

'a watercolor painting of an astronaut' Pixray Text-to-Image
a watercolor painting of an astronaut

'an oil painting of a worried woman | Rendered in Cinema4D' Pixray Text-to-Image
an oil painting of a worried woman | Rendered in Cinema4D

'an ugly creature' Pixray Text-to-Image
an ugly creature

'Dracula' Pixray Text-to-Image
Dracula

'Frankenstein' Pixray Text-to-Image
Frankenstein

'vector art of a forest clearing' Pixray Text-to-Image
vector art of a forest clearing


Name: CLIP Guided Diffusion v6
Author: Dango233
Original script: https://colab.research.google.com/drive/14xBm1aSxQLbq26-jmDJi8I1HJ4ti5ybt
Time for 512×512 on a 3090: 3 minutes 10 seconds
Maximum resolution on a 24 GB 3090: Locked to 512×512
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Latest CLIP Guided Diffusion script. The best one yet. Capable of some very nice results.

'a hyperrealistic painting of a human' CLIP Guided Diffusion v6 Text-to-Image
a hyperrealistic painting of a human

'a sketch of planets' CLIP Guided Diffusion v6 Text-to-Image
a sketch of planets

'a storybook illustration of a cloudy sunset' CLIP Guided Diffusion v6 Text-to-Image
a storybook illustration of a cloudy sunset

'a wizard | vivid colors' CLIP Guided Diffusion v6 Text-to-Image
a wizard | vivid colors

'an art deco sculpture of a planet' CLIP Guided Diffusion v6 Text-to-Image
an art deco sculpture of a planet

'an attractive man by John Linnell' CLIP Guided Diffusion v6 Text-to-Image
an attractive man by John Linnell

'an oil on canvas painting of satan' CLIP Guided Diffusion v6 Text-to-Image
an oil on canvas painting of satan

'an oil painting of a clown' CLIP Guided Diffusion v6 Text-to-Image
an oil painting of a clown

'digital art of an ugly person by Avigdor Arikha' CLIP Guided Diffusion v6 Text-to-Image
digital art of an ugly person by Avigdor Arikha

'princess in sanctuary trending on artstation photorealistic portrait of a young princess' CLIP Guided Diffusion v6 Text-to-Image
princess in sanctuary trending on artstation photorealistic portrait of a young princess


Name: CLIPDraw
Author: Kevin Frans
Original script: https://colab.research.google.com/github/kvfrans/clipdraw/blob/main/clipdraw.ipynb
Time for 512×512 on a 3090: 7 minutes 10 seconds
Maximum resolution on a 24 GB 3090: Huge. 4096×4096 and beyond.
Maximum resolution on an 8GB 2080: 1024×1024
Description: Generates images from a series of lines. Very abstract results.

'a cloudy sunset' CLIPDraw Text-to-Image
a cloudy sunset

'a digital painting of a rose' CLIPDraw Text-to-Image
a digital painting of a rose

'a sad clown' CLIPDraw Text-to-Image
a sad clown

'an abstract painting of Yoda' CLIPDraw Text-to-Image
an abstract painting of Yoda

'an etching of a library' CLIPDraw Text-to-Image
an etching of a library

'The Sydney Harbour Bridge' CLIPDraw Text-to-Image
The Sydney Harbour Bridge



Any Others I Missed?

Do you know of any other colabs and/or GitHub Text-to-Image systems I have missed? Let me know and I will see if I can convert them to work with Visions of Chaos for a future release. If you know of any public Discords with other colabs being shared let me know too.

Jason.

Text-to-Image Summary – Part 2

This is Part 2. There is also Part 1, Part 3, Part 4, Part 5 and Part 6.

This post continues listing the Text-to-Image scripts included with Visions of Chaos and some example outputs from each script.


Name: VQGAN Gumbel
Author: Eleiber
Original script: https://colab.research.google.com/drive/1tim3xTsZXafK-A2rOUsevckdl4OitIiw
Time for 512×512 on a 3090: 3 minutes 27 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480
Maximum resolution on an 8GB 2080: 256×256 4 minutes 05 seconds
Description: Variation using the gumbel-8192 model. Results are a bit rougher than others.

'a childs drawing of a space nebula' VQGAN Gumbel Text-to-Image
a childs drawing of a space nebula

'a movie monster in the style of Edvard Munch' VQGAN Gumbel Text-to-Image
a movie monster in the style of Edvard Munch

'a raytraced image of the Amazon Rainforest' VQGAN Gumbel Text-to-Image
a raytraced image of the Amazon Rainforest

'a tropical beach in the style of Polock' VQGAN Gumbel Text-to-Image
a tropical beach in the style of Polock

'digital art of a rose' VQGAN Gumbel Text-to-Image
digital art of a rose


Name: OpenAI DVAE+CLIP
Author: Katherine Crowson
Original script: https://colab.research.google.com/drive/10DzGECHlEnL4oeqsN-FWCkIe_sq3wVqt
Time for 512×512 on a 3090: 3 minutes 07 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480
Maximum resolution on an 8GB 2080: 256×256 2 minutes 20 seconds
Description: Results are very colorful and more abstract. By default it gives noisier output images but this can be disabled if you prefer.

'a dragon' OpenAI DVAE+CLIP Text-to-Image
a dragon

'a hyperrealistic painting of planets' OpenAI DVAE+CLIP Text-to-Image
a hyperrealistic painting of planets

'a mountain cabin' OpenAI DVAE+CLIP Text-to-Image
a mountain cabin

'a woodcut of a mountain range in the style of Marvel Comics' OpenAI DVAE+CLIP Text-to-Image
a woodcut of a mountain range in the style of Marvel Comics

'an angry person' OpenAI DVAE+CLIP Text-to-Image
an angry person


Name: Aphantasia
Author: Vadim Epstein
Original script: https://github.com/eps696/aphantasia
Time for 512×512 on a 3090: 1 minute 5 seconds
Maximum resolution on a 24 GB 3090: 4096×4096 or 2520×1080
Maximum resolution on an 8GB 2080: 4096×4096 7 minutes 48 seconds
Description: Different and more messy pastel abstract Turneresque output. I spent a few hours trying many different combinations of settings to make the output more coherent and the colors deeper. The following samples are as good as I could get it. I give up for now. If you can do better let me know. It does support creating larger 1280×720 resolution images on a 3090 GPU.

'a marble sculpture of a computer' Aphantasia Text-to-Image
a marble sculpture of a computer

'an eyeball' Aphantasia Text-to-Image
an eyeball

'an octopus' Aphantasia Text-to-Image
an octopus

'digital art of frogs in the style of Dr Seuss' Aphantasia Text-to-Image
digital art of frogs in the style of Dr Seuss

'medusa' Aphantasia Text-to-Image
medusa


Name: Text2Image VQGAN
Author: Vadim Epstein
Original script: https://colab.research.google.com/github/eps696/aphantasia/blob/master/CLIP_VQGAN.ipynb
Time for 512×512 on a 3090: 2 minutes 8 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480
Maximum resolution on an 8GB 2080: 256×256 2 minutes 15 seconds
Description: Allows larger sized 480p images (854×480) on a 3090 GPU.

'a digital painting of the Las Vegas strip' Text2Image VQGAN Text-to-Image
a digital painting of the Las Vegas strip

'a midnineteenth century engraving of a cute monster' Text2Image VQGAN Text-to-Image
a midnineteenth century engraving of a cute monster

'a skeleton' Text2Image VQGAN Text-to-Image
a skeleton

'an ultrafine detailed painting of a crying person' Text2Image VQGAN Text-to-Image
an ultrafine detailed painting of a crying person

'puppies' Text2Image VQGAN Text-to-Image
puppies


Name: MSE VQGAN+CLIP z+quantize
Author: jbusted
Original script: https://colab.research.google.com/drive/1gFn9u3oPOgsNzJWEFmdK-N9h_y65b8fj
Time for 512×512 on a 3090: 6 minutes 19 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480
Maximum resolution on an 8GB 2080: 256×256 3 minutes 36 seconds
Description: Awesome crisp results. Allows larger sized 480p images (854×480) on a 3090 GPU. One of the best scripts in this list and well worth exploring.

'a charcoal drawing of a country town' MSE VQGAN+CLIP z+quantize Text-to-Image
a charcoal drawing of a country town

'a hyperrealistic painting of an ugly creature' MSE VQGAN+CLIP z+quantize Text-to-Image
a hyperrealistic painting of an ugly creature

'a landscape made of mist' MSE VQGAN+CLIP z+quantize Text-to-Image
a landscape made of mist

'a mosaic of christmas' MSE VQGAN+CLIP z+quantize Text-to-Image
a mosaic of christmas

'an octopus in the style of Vincent van Gogh' MSE VQGAN+CLIP z+quantize Text-to-Image
an octopus in the style of Vincent van Gogh

MSE VQGAN+CLIP z+quantize allows specifying an image as the input starting point. If you take the output and repeatedly use it as the input, with some minor image stretching each frame, you get a movie zooming into the Text-to-Image output. No blending of frames or optical flow for this one, just straight combining of the 854×480 resolution frames into a movie. The VQGAN model was “vqgan_imagenet_f16_16384” and the CLIP model was “ViT-B/32”. The prompts for this movie were “hyperrealistic homer simpson”, “hyperrealistic marge simpson”, “hyperrealistic bart simpson”, “hyperrealistic lisa simpson” and “hyperrealistic maggie simpson”. The original 480p upload looked terrible after YouTube compressed it, so I upscaled the 480p to 2160p (4K) in DaVinci Resolve and reuploaded to YouTube. This caused their compression to do a better encoding job so the movie is now watchable.
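
The feedback loop described above can be sketched roughly as follows. This is only an illustration and not the code used in Visions of Chaos: `generate` is a hypothetical stand-in for whichever Text-to-Image script is being run, `stretch` is an assumed nearest-neighbour zoom, and frames are tiny grayscale images stored as lists of rows.

```python
def stretch(image, zoom=1.05):
    """Zoom into the centre of a 2D list of pixel values (nearest neighbour)."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    out = []
    for y in range(height):
        # Map each output row back to a source row that is closer to the centre.
        sy = min(height - 1, max(0, round(cy + (y - cy) / zoom)))
        row = []
        for x in range(width):
            sx = min(width - 1, max(0, round(cx + (x - cx) / zoom)))
            row.append(image[sy][sx])
        out.append(row)
    return out


def zoom_movie(first_frame, generate, frame_count, zoom=1.05):
    """Feed each output back in as the next input, stretched slightly."""
    frames = [first_frame]
    for _ in range(frame_count - 1):
        seed_image = stretch(frames[-1], zoom)
        # generate() is hypothetical: it takes a seed image and returns a new frame.
        frames.append(generate(seed_image))
    return frames
```

The frames returned by `zoom_movie` would then be joined into a movie in order, with no blending between them.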

This next example is how MSE VQGAN+CLIP z+quantize interprets various common human phobias. Text prompts were “a hyperrealistic painting depicting acrophobia” etc. To try to smooth out the “flickering” when zooming I started using ImageMagick for the zooming, as it supports sub pixel image resizing. This movie was also originally 480p and upsized to 4K in DaVinci Resolve before uploading.
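
To illustrate what sub pixel resizing means, here is a minimal sketch using bilinear interpolation. It is not ImageMagick's actual filter, just an assumed simple example: each output pixel is read from a fractional source position, so very small zoom amounts give smooth changes instead of whole pixels jumping between frames.

```python
def sample_bilinear(image, y, x):
    """Read a 2D list of pixel values at a fractional (y, x) position."""
    height, width = len(image), len(image[0])
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    fy, fx = y - y0, x - x0
    # Blend the four surrounding pixels by how close the position is to each.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def zoom_subpixel(image, zoom=1.01):
    """Zoom into the centre of the image using fractional source positions."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    return [
        [sample_bilinear(image, cy + (y - cy) / zoom, cx + (x - cx) / zoom)
         for x in range(width)]
        for y in range(height)
    ]
```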

I have also added some basic scripting (as in automating a series of steps rather than a Python py script) support to Visions of Chaos. Scripting allows the prompt, zoom speed, rotation and panning to be changed during the movie with smooth interpolations between them each frame.
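
As a rough idea of how smooth interpolation between scripted values can work, here is a minimal sketch. The keyframe layout and the setting names (`zoom`, `rotation`) are assumptions made for the example, not the actual Visions of Chaos script format, and it only covers numeric settings (changing the prompt text is a separate problem).

```python
def interpolate_settings(keyframes, frame):
    """Linearly interpolate numeric settings between the surrounding keyframes.

    keyframes is a list of (frame_number, settings_dict) pairs, assumed to be
    sorted with strictly increasing frame numbers and identical setting names.
    """
    if frame <= keyframes[0][0]:
        return dict(keyframes[0][1])
    if frame >= keyframes[-1][0]:
        return dict(keyframes[-1][1])
    for (f0, s0), (f1, s1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)  # 0.0 at the first keyframe, 1.0 at the second
            return {key: s0[key] + (s1[key] - s0[key]) * t for key in s0}
```

For example, with keyframes at frame 0 and frame 100, asking for frame 50 returns values halfway between the two sets of settings.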

Text-to-Image Script GUI

The following video is a test of the scripting. This video is a Powers of Ten homage with zooming in from the largest scales to the smallest scales.

Another recent addition is the ability to use a series of images as “seed images” that are processed one at a time and then combined into a movie. The following GIF of the Alien chestburster scene is an example of this. The Text-to-Image prompt was “impasto oil painting”.

This next example movie is showing a “Self-Driven” zoom movie. As in a regular zoom movie the output frames are slightly stretched and fed back into the system each frame. The self-driven difference with this movie is that the Text-to-Image prompt text is automatically changed every 2 seconds by CLIP detecting what it “sees” in the current frame. This way the movie subjects are automatically changed and steered in new directions in a totally automated way. There is no human control except me setting the initial “Rainbow colored blobs” prompt. After that it was fully automated.
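
The self-driven loop can be sketched as below. All three callables are hypothetical placeholders passed in by the caller: `generate(prompt, image)` for the Text-to-Image script, `caption(image)` for the CLIP captioning step and `stretch(image)` for the per-frame zoom. `frames_per_prompt` would be 2 seconds worth of frames, for example 60 at an assumed 30 frames per second.

```python
def self_driven_movie(first_frame, first_prompt, generate, caption, stretch,
                      frame_count, frames_per_prompt):
    """Zoom movie where the prompt is replaced by a caption of the latest frame."""
    frames = [first_frame]
    prompts = [first_prompt]
    prompt = first_prompt
    for index in range(1, frame_count):
        if index % frames_per_prompt == 0:
            # Let the captioner decide what the movie is "about" from here on.
            prompt = caption(frames[-1])
        frames.append(generate(prompt, stretch(frames[-1])))
        prompts.append(prompt)
    return frames, prompts
```

The only human input is `first_prompt`. Every later prompt comes from the captioner.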

By default the CLIP Image Captioning script is very good at detecting what is in an image. Using the default accuracy resulted in a zoom movie that got stuck on a single topic or subject. One got stuck on slight variations of a prompt dealing with kites, so as the zoom movie went deeper it only showed kites. Luckily, after tweaking and decreasing the accuracy of the CLIP captioning, the predictions allow the resulting subjects to drift to new topics during the movie.
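
Exactly how the captioning accuracy was decreased is not covered here. One possible way to get a similar drift, shown purely as an assumed illustration, is to sample from the scored candidate captions instead of always taking the best match. The `temperature` parameter and the `(caption, score)` list are hypothetical names for this sketch.

```python
import math
import random


def pick_caption(scored_captions, temperature=1.0, rng=random):
    """Pick one caption from (caption, score) pairs.

    A temperature of 0 always returns the best match. Higher temperatures
    flatten the odds so lower scoring captions get picked more often.
    """
    if temperature <= 0:
        return max(scored_captions, key=lambda item: item[1])[0]
    best = max(score for _, score in scored_captions)
    # Softmax style weights, shifted by the best score to avoid overflow.
    weights = [math.exp((score - best) / temperature) for _, score in scored_captions]
    captions = [text for text, _ in scored_captions]
    return rng.choices(captions, weights=weights, k=1)[0]
```

With a temperature of 0 a movie about kites would keep getting kite captions. With a higher temperature a less likely caption occasionally wins and steers the movie somewhere new.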


Name: Monster Maker
Author: P_Hoep
Original script: https://colab.research.google.com/drive/1ZbLnt5fLS_BDfpQY-9Dh_T40pLjfqSAC
Time for 512×512 on a 3090: 2 minutes 01 seconds
Description: No longer available. I was contacted by the author who does not want it shared publicly. The colab link no longer works.

'a black and white photo of a library in the style of Rembrandt' Monster Maker Text-to-Image
a black and white photo of a library in the style of Rembrandt

'a forest fire' Monster Maker Text-to-Image
a forest fire

'a forest path' Monster Maker Text-to-Image
a forest path

'a heart made of feathers' Monster Maker Text-to-Image
a heart made of feathers

'a surrealist painting of the Las Vegas strip' Monster Maker Text-to-Image
a surrealist painting of the Las Vegas strip


Name: CLIP Guided Diffusion
Author: Katherine Crowson
Original script: https://colab.research.google.com/drive/12a_Wrfi2_gwwAuN3VvMTwVMz9TfqctNj
Time for 256×256 on a 3090: 1 minute 35 seconds
Maximum resolution on a 24 GB 3090: Locked to 256×256
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: This one gives very unique results compared to the other scripts. Locked to 256×256 resolution. Some of the results can be very detailed and interesting, but a lot of the time it is hit and miss to get a result that reliably matches the input phrase. The following samples were hand picked from a large batch run of random phrases.

'a clown' CLIP Guided Diffusion Text-to-Image
a clown

'a hyperrealistic painting of a witch' CLIP Guided Diffusion Text-to-Image
a hyperrealistic painting of a witch

'a sea monster' CLIP Guided Diffusion Text-to-Image
a sea monster

'a surrealist sculpture of an android' CLIP Guided Diffusion Text-to-Image
a surrealist sculpture of an android

'Brad Pitt' CLIP Guided Diffusion Text-to-Image
Brad Pitt

'New York City' CLIP Guided Diffusion Text-to-Image
New York City


Name: CLIP Guided Diffusion v2
Author: afiaka87
Original script: https://colab.research.google.com/github/afiaka87/clip-guided-diffusion/blob/main/colab_clip_guided_diff_hq.ipynb
Time for 256×256 on a 3090: 2 minutes 38 seconds
Maximum resolution on a 24 GB 3090: Locked to 256×256
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Modified CLIP Guided Diffusion with more options. This one gives very unique results compared to the other scripts. Locked to 256×256 resolution. Hopefully larger resolution versions of this script will appear in the future. Some of the results can be very detailed and interesting, but a lot of the time it is hit and miss to get a result that reliably matches the input phrase. The following samples were hand picked from a large batch run of random phrases.

'a digital painting of a crying person' CLIP Guided Diffusion v2 Text-to-Image
a digital painting of a crying person

'a fine art painting of heaven in the style of Edvard Munch' CLIP Guided Diffusion v2 Text-to-Image
a fine art painting of heaven in the style of Edvard Munch

'a flemish baroque of an angry person' CLIP Guided Diffusion v2 Text-to-Image
a flemish baroque of an angry person

'a flemish baroque of hell' CLIP Guided Diffusion v2 Text-to-Image
a flemish baroque of hell

'a surrealist painting of a witch' CLIP Guided Diffusion v2 Text-to-Image
a surrealist painting of a witch

'the australian outback' CLIP Guided Diffusion v2 Text-to-Image
the australian outback


Name: CLIPRGB
Author: Jonathan Whitaker
Original script: https://colab.research.google.com/drive/1MiKaFFgau6V5QhIed5tpNdLUiSbof4nI
Time for 512×512 on a 3090: 4 minutes 51 seconds
Maximum resolution on a 24 GB 3090: 4096×4096
Maximum resolution on an 8GB 2080: 4096×4096
Description: Very early 0.1 version shows a lot of potential. Can render huge resolution images up to 4096×4096 on a 3090 so I am really looking forward to future versions of this code with sharper details.

'a digital painting of a wizard' CLIPRGB
a digital painting of a wizard

'a forest path' CLIPRGB
a forest path

'a tattoo of planets' CLIPRGB
a tattoo of planets

'a vampire' CLIPRGB
a vampire


Name: CLIP Guided Diffusion v3
Author: Michael Friesen
Original script: https://colab.research.google.com/drive/1Fl2SZvLv23MVSAHxkoiNdxPeAZwibvu1
Time for 512×512 on a 3090: 2 minutes 23 seconds
Maximum resolution on a 24 GB 3090: Locked to 512×512
Maximum resolution on an 8GB 2080: Unable to run on 8GB VRAM
Description: Modified CLIP Guided Diffusion that generates larger 512×512 images. Some of the results can be very detailed and interesting, but a lot of the time it is hit and miss to get a result that reliably matches the input phrase. The following samples were hand picked from a large batch run of random phrases.

'a cubist painting of a castle' CLIP Guided Diffusion v3 Text-to-Image
a cubist painting of a castle

'a human made of vines' CLIP Guided Diffusion v3 Text-to-Image
a human made of vines

'a rough seascape' CLIP Guided Diffusion v3 Text-to-Image
a rough seascape

'frogs' CLIP Guided Diffusion v3 Text-to-Image
frogs

'h r giger' CLIP Guided Diffusion v3 Text-to-Image
h r giger

'a matte painting of a landscape' CLIP Guided Diffusion v3 Text-to-Image
a matte painting of a landscape


Name: Zoetrope 5
Author: Bearsharktopusdev
Original script: https://colab.research.google.com/drive/1LpEbICv1mmta7Qqic1IcRTsRsq7UKRHM
Time for 512×512 on a 3090: 2 minutes 36 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1280×720
Maximum resolution on an 8GB 2080: 256×256 2 minutes 09 seconds
Description: Nice crisp results. Can generate up to 720p (1280×720) resolution images on a 3090. Includes a lot of new ideas from multiple people to help improve the outputs.

'a detailed painting of a Pixar character' Zoetrope 5 Text-to-Image
a detailed painting of a Pixar character

'a futuristic city' Zoetrope 5 Text-to-Image
a futuristic city

'a planet' Zoetrope 5 Text-to-Image
a planet

'a surrealist sculpture of a sea monster' Zoetrope 5 Text-to-Image
a surrealist sculpture of a sea monster

'an art deco scultpture of a policeman' Zoetrope 5 Text-to-Image
an art deco scultpture of a policeman

'cyberpunk art of a forest fire in the style of Edvard Munch' Zoetrope 5 Text-to-Image
cyberpunk art of a forest fire in the style of Edvard Munch


Name: CLIP RGB Optimization
Author: hotgrits
Original script: https://cdn.discordapp.com/attachments/730484623028519072/871624258260987934/CLIP__RGB_Optimization_v0_3.ipynb
Time for 512×512 on a 3090: 2 minutes 50 seconds
Maximum resolution on a 24 GB 3090: 4096×4096
Maximum resolution on an 8GB 2080: 4096×4096
Description: Another CLIP RGB based script without the pixelated artefacts of the CLIPRGB script. Can render huge resolution images up to 4096×4096 on a 3090. This script gives more impressionistic textures. By default the output was a bit too dark for my liking so I have added options to tweak the gamma and contrast of the output images in the script. The gamma and contrast tweaks are only at the display stage and do not change the internal image being generated.
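
A display-only adjustment like this can be sketched as follows. The formulas are assumptions (a common power-law gamma and a contrast stretch around mid grey), not necessarily the ones used in the script. The point is that a new list is returned and the internal image values are left untouched.

```python
def adjust_for_display(pixels, gamma=1.0, contrast=1.0):
    """Return an adjusted copy of pixel values in the 0.0 to 1.0 range."""
    adjusted = []
    for value in pixels:
        value = min(max(value, 0.0), 1.0)
        value = value ** (1.0 / gamma)          # gamma above 1 brightens the midtones
        value = (value - 0.5) * contrast + 0.5  # contrast above 1 pushes values away from mid grey
        adjusted.append(min(max(value, 0.0), 1.0))
    return adjusted
```

The optimizer would keep working on `pixels` while only the output of `adjust_for_display(pixels, ...)` is shown or saved.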

'a babbling brook' CLIP RGB Optimization
a babbling brook

'a movie monster' CLIP RGB Optimization
a movie monster

'an amusement park' CLIP RGB Optimization
an amusement park

'Chewbacca' CLIP RGB Optimization
Chewbacca

'Freddy Kruger in the style of Rembrandt' CLIP RGB Optimization
Freddy Kruger in the style of Rembrandt


Name: MSE Regulized VQGAN+CLIP
Author: jbusted
Original script: https://colab.research.google.com/drive/1hf1seGOZctOJUznkhJNblLluXHbWLKZh
Time for 512×512 on a 3090: 3 minutes 16 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480
Maximum resolution on an 8GB 2080: 128×128 2 minutes 30 seconds
Description: Generates good images but they tend to be inside a grey/purple border void.

'a bronze sculpture of a heart' MSE Regulized VQGAN+CLIP
a bronze sculpture of a heart

'a cubist painting of Buzz Lightyear' MSE Regulized VQGAN+CLIP
a cubist painting of Buzz Lightyear

'a house made of string' MSE Regulized VQGAN+CLIP
a house made of string

'an art deco sculpture of a vampire' MSE Regulized VQGAN+CLIP
an art deco sculpture of a vampire

'chalk art of C-3PO' MSE Regulized VQGAN+CLIP
chalk art of C-3PO


Name: Sequential VQGAN+CLIP
Author: Jakeukalane and Avengium
Original script: https://colab.research.google.com/drive/1CcibxlLDng2yzcjLwwwSADRcisc1qVCs
Time for 512×512 on a 3090: 1 minute 41 seconds
Maximum resolution on a 24 GB 3090: 768×768 or 1120×480
Maximum resolution on an 8GB 2080: 256×256 2 minutes 11 seconds
Description: Really nice results and fast.

'a campfire in the style of Vincent van Gogh' Sequential VQGAN+CLIP
a campfire in the style of Vincent van Gogh

'a colorful parrot' Sequential VQGAN+CLIP
a colorful parrot

'a hyperrealistic painting of C-3PO' Sequential VQGAN+CLIP
a hyperrealistic painting of C-3PO

'an impressionist painting of Buzz Lightyear made of paper' Sequential VQGAN+CLIP
an impressionist painting of Buzz Lightyear made of paper

'New York City' Sequential VQGAN+CLIP
New York City


Name: CLIPRGB ImStack
Author: Jonathan Whitaker
Original script: https://colab.research.google.com/drive/1MCC2IwAaRNCTBUzghuG41ypAkxjJvGtq
Time for 512×512 on a 3090: 2 minutes 07 seconds
Maximum resolution on a 24 GB 3090: 2048×2048
Maximum resolution on an 8GB 2080: 512×512 6 minutes 21 seconds
Description: Another CLIP RGB variation. Nice results after some brightness, contrast and sharpness tweaks to the generated images. Could still be a bit sharper.

'a fine art painting of an angry person' CLIPRGB ImStack
a fine art painting of an angry person

'a fireplace in the style of Claude Monet' CLIPRGB ImStack
a fireplace in the style of Claude Monet

'a frog in the style of Beksinski' CLIPRGB ImStack
a frog in the style of Beksinski

'a nightmare creature in the style of H R Giger' CLIPRGB ImStack
a nightmare creature in the style of H R Giger

'a pointalism painting of a vampire made of copper' CLIPRGB ImStack
a pointalism painting of a vampire made of copper


Jason.

NVIDIA OptiX Denoiser now included with Visions of Chaos

When you render Flame Fractals the pixels accumulate over time. As this process happens different areas accumulate at different rates. Areas that have high hit counts render smoothly while other areas that have lower hit counts render more noisy/speckled.

While I was recently revisiting the Fractal Flame mode in Visions of Chaos I experimented with some ideas I had for a “smart blur” function. The idea was to blur pixels by different amounts depending on their neighbors, so areas with more noise would be blurred more than areas with little difference between pixels. After some experimenting with various convolution kernels etc I gave up. My results were either too blurry, too dark or had other issues.
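
For illustration only, one way to read the “smart blur” idea is to blend each pixel towards its 3×3 neighborhood average by an amount that grows with the local variance. This is an assumed sketch, not the abandoned Visions of Chaos code, and `threshold` is a hypothetical tuning value that must be greater than zero.

```python
def smart_blur(image, threshold=0.01):
    """Blur a 2D list of 0.0-1.0 values more where neighbouring pixels differ more."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            mean = sum(neighbours) / len(neighbours)
            variance = sum((v - mean) ** 2 for v in neighbours) / len(neighbours)
            # blend is near 0 in flat areas and approaches 1 in noisy areas.
            blend = variance / (variance + threshold)
            row.append(image[y][x] * (1 - blend) + mean * blend)
        out.append(row)
    return out
```

A filter like this cannot tell noise from genuine fine detail, since both show up as high local variance, which fits with the results ending up too blurry.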

Searching for other methods got me to “denoising”.

Note: all of the example images in this post need to be seen full 4K size to notice the differences.

NVIDIA OptiX

NVIDIA OptiX is a raytracing engine. Part of what it can do is denoising.

Denoising is a big deal in ray tracing areas. When raytracing images (depending on the method used) you need to make many passes of the image before the details are smoothed out.

Here is an example from Visions of Chaos (the “Pathtracing Global Illumination” example shader) after running for a few seconds.

Denoiser

You can see a lot of noise visible as the rendering was stopped way before the pathtracing converged to a smooth image.

Running the image through the OptiX denoiser gives the following result.

Denoiser

A much smoother result.

The alternative is to allow the pathtracer to run for much much longer until it smooths itself out. The next image took around 10 minutes to accumulate the smoothness that OptiX took a second to smooth. If you look closely the details are better in this image compared to the denoised image.

Denoiser

In some cases the noise may be desirable. If like me you prefer the film grain of a good 70’s movie to the smooth digital look of a modern film then a bit of noise may be more aesthetically pleasing to you, but in general less noise is desirable in rendered images.

Accumulation Fractals

For Visions of Chaos my interest in denoising was for modes like Buddhabrots, Flame Fractals, Iterated Function Systems, and Strange Attractor images: all the modes that have images built up by the accumulation of points. A side effect of this accumulation is that areas of the image hit less often show up noisier than areas hit more frequently.
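
To make the accumulation idea concrete, here is a minimal sketch that plots a simple Iterated Function System (a Sierpinski triangle via the chaos game) into a grid of hit counts. The uneven `weights`, the grid size and the log brightness mapping are assumptions chosen for the example, not the Visions of Chaos settings.

```python
import math
import random


def accumulate_ifs(size=64, points=20000, seed=1):
    """Chaos game Sierpinski triangle accumulated into a grid of hit counts."""
    rng = random.Random(seed)
    corners = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
    weights = [0.6, 0.3, 0.1]  # uneven odds so some areas are hit far less often
    hits = [[0] * size for _ in range(size)]
    x, y = 0.25, 0.25
    for _ in range(points):
        cx, cy = rng.choices(corners, weights=weights, k=1)[0]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        hits[min(size - 1, int(y * size))][min(size - 1, int(x * size))] += 1
    return hits


def to_brightness(hits):
    """Log scale the hit counts to 0.0-1.0 so sparse areas remain visible."""
    peak = max(max(row) for row in hits)
    return [[math.log1p(count) / math.log1p(peak) for count in row] for row in hits]
```

Cells with only a handful of hits vary a lot from their neighbors by pure chance, which is the speckled noise seen in the less visited parts of these images. Plotting more points smooths them out, at the cost of render time.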

Denoising to the rescue!

When It Works

Here are some examples of how the denoiser can help. It seems to work best (at least from my initial experiments) with images that have wispy details.

Iterated Function System with noisy areas.

Denoiser

The same image processed with OptiX.

Denoiser

Flame Fractal with noisy areas.

Denoiser

The same image processed with OptiX.

Denoiser

Strange Attractor with noisy areas.

Denoiser

The same image processed with OptiX.

Denoiser

When It Doesn’t Work

Denoising is not a magic bullet to fix any pixelated image. It was specifically trained to denoise noisy raytraced images.

Here are a few example images that it did not help. When the denoising “fails” it tends to smear areas of the image rather than enhancing them.

Buddhabrot Fractal. Surprisingly Buddhabrots do not seem to be a good candidate for OptiX denoising. The noisy areas do get denoised, but areas with details tend to get smeared more.

Buddhabrot before.

Denoiser

Buddhabrot after OptiX denoising.

Denoiser

Multi-Scale Turing Pattern before.

Denoiser

Multi-Scale Turing Pattern after.

Denoiser

Magnetic Pendulum before. This is a good example of how denoising needs noise to work. This image has a lot of fine detail, but nothing that could be classified as noise.

Denoiser

Magnetic Pendulum after.

Denoiser

Availability

Denoising is now included with Visions of Chaos. I am including the excellent command line version from Declan Russell. Denoising any image created in Visions of Chaos is now just a click away. If you want to denoise other images outside Visions of Chaos I highly recommend Declan’s implementation.

Jason.

Dendritic Crystal Growth

Dendrites are the multi-branched, fractal-like patterns that can grow in crystals or metals. Snowflakes are also a good example of dendritic growth.

Snowflake

After seeing this post on r/Simulations that linked to the paper Numerical Simulation of Dendritic crystal growth using phase field method and investigating the effects of different physical parameter on the growth of the dendrite (great name), which includes some Matlab code, I was able to add Dendritic Crystal Growth as a new mode in Visions of Chaos.

Also see the original paper referenced which is Ryo Kobayashi’s “Modeling and numerical simulations of dendritic crystal growth”.

Results

Here are some results from the new mode. I have also added an option to give the growths a 3D embossed look.

Dendritic Crystal Growth

Dendritic Crystal Growth

Dendritic Crystal Growth

Dendritic Crystal Growth

Dendritic Crystal Growth

Dendritic Crystal Growth

Dendritic Crystal Growth

Dendritic Crystal Growth

Here is a sample movie

Bug fix for code

At first my translation of the Matlab code in the paper did not result in anything but boring smooth edged blobs. I don’t own a copy of Matlab to verify the code, but I assumed (as any programmer does when their code doesn’t work) that it must be something I am doing wrong. But after some time checking and double-checking I did find some problems with the code in the paper.

Firstly this line near the end


tnew(i,j) =t(i,j) + lap_t(i,j)*dt + k*(phi(i,j) -phiold);

needs to be changed to


tnew(i,j) =t(i,j) + lap_t(i,j)*dt + k*(phinew(i,j) -phiold);

otherwise the phi-phiold cancels out and k is multiplied by zero.
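To see why the original line has no effect, here is a minimal Python sketch of the temperature update for a single grid cell. The function name and the sample numbers are made up for illustration and are not taken from the paper's Matlab code; only the update formula comes from the corrected line above.

```python
def update_temperature(t, lap_t, phi_old, phi_new, dt, k):
    """Explicit update of one temperature cell: diffusion plus latent heat.

    The latent heat term k * (phi_new - phi_old) is only non-zero when the
    freshly computed phase value is passed in, which is the point of the fix.
    """
    return t + lap_t * dt + k * (phi_new - phi_old)


# Hypothetical values for one cell (not from the paper).
t, lap_t, dt, k = 0.0, 0.5, 0.0001, 1.8
phi_old, phi_new = 0.2, 0.3

# Original line: the old phase value is subtracted from itself.
buggy = update_temperature(t, lap_t, phi_old, phi_old, dt, k)
# Corrected line: the new phase value is used.
fixed = update_temperature(t, lap_t, phi_old, phi_new, dt, k)
print(buggy, fixed)
```

With the original line only the diffusion term changes the temperature, so the growing crystal never releases any heat into its surroundings.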

Secondly the calculations for grad_epsilon2_x and grad_epsilon2_y


grad_epsilon2_x = (epsilon(ip,j)^2 -epsilon(im,j)^2)/dx;
grad_epsilon2_y = (epsilon(i,jp)^2 -epsilon(i,jm)^2)/dy;

should be moved to the second loop before the term1 calculation.

With those two simple fixes my code was generating the correct dendritic structures and created the images in this post.

Availability

Dendritic Crystal Growth and two Snowflake modes are now included under the new Dendritic Growth mode menu in Visions of Chaos.

Jason.

Combinations Cellular Automata

Combinations Cellular Automaton

This is a new idea for a 1D cellular automaton that came from Asher (blog YouTube).

Combinations Cellular Automaton

Trying to explain this clearly is harder work than programming it.

Combinations Cellular Automaton

Rule string

Combinations CAs can have between 2 and 10 states per rule. 2 states means the cells can be either on or off. 3 states means cells have 1 dead state and 2 living states.

The CAs are governed by a rule string.

The rule string needs to be states^2 characters in length. For 2 state rules the rule string will need 4 characters, for 4 states the rule string needs 16 characters, for 10 states the rule string needs 100 characters.

For 2 states the rule string characters are in base 2, so binary values of either 1 or 0. For 3 states the rule string is base 3, so the characters are all between 0 and 2. For 10 states the rule string is base 10, so the characters are all between 0 and 9.

An example 2 state rule is 0110 which creates the following;

An example 3 state rule could be 012120200 that creates the following result;

An example 10 state rule could be

50391696401156117866581414266160495600057647563383
52633979845117861158665940485681011955680007489199

which gives this result;

Number of possible rules

When you have a maximum of 2 states, the rule string has 4 digits of either 0 or 1. The maximum rule string is 1111, which is binary for 15 in decimal. That gives you 16 possible rules (0000 is also counted) for 2 states.

Increasing to 3 states has a 9 digit rule string with base 3 digits. 222222222 is the maximum base 3 number which translates to 19682, so there are 19683 possible 3 state CA rules.

4 states is 16 characters. 3333333333333333 is 4,294,967,295 in decimal, so 4,294,967,296 possible 4 state rules.

10 states is 100 characters. No need for decimal conversion as base 10 is decimal. So 1 with 100 zeros after it possible rules. That may as well be an infinite space to look for rules within.
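The counts above can be checked with a few lines of Python. This is only a quick sketch; the function names are made up for this example.

```python
def rule_length(states):
    """Number of characters in a rule string for the given state count."""
    return states ** 2


def rule_count(states):
    """Number of distinct rule strings: one base-`states` digit per position."""
    return states ** rule_length(states)


for states in (2, 3, 4, 10):
    print(states, rule_length(states), rule_count(states))

# The largest 3 state rule string read as a base 3 number.
largest_3_state = int("222222222", 3)
print(largest_3_state)
```

Python integers have arbitrary precision, so even the 10 state count (a 1 followed by 100 zeros) is calculated exactly.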

Combinations Cellular Automaton

How the cells are updated

Each cell in the CA is updated by taking into account 2 cells from the previous step. These 2 cells are converted into a decimal value that is used as an index in a rule string to give the new cell state. I bet that makes loads of sense! Let me try and explain.

An example with a 2 state CA using rule 0110, so cells can only be either state 0 or state 1.

If the first step is a single center cell, then the values would look like;


00000100000
?

Cell at ? looks at the cell above it and the cell next to it. These are 00. 00 converted to decimal is 0. So we look at the 0th entry in the rule string, so the cell value becomes 0.


00000100000
0

Same for the next 3 cells


00000100000
0000

Then we get to the next ? cell


00000100000
0000?

This cell has the cell above it and next to it as 01. 01 converts to decimal 1, so we look at the digit at index 1 (the second digit) in the rule string. The second digit in the rule is a 1, so the new cell becomes a 1.


00000100000
00001?

The next ? cell now has 10 above it. This converts to a 2 in decimal, so we look at the third rule digit. Again it is a 1.


00000100000
000011

The rest of the row has 00 above them, so they get set to 0.


00000100000
00001100000
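The steps above can be sketched in Python as follows. The function name is made up for this example, and wrapping around at the right hand edge is an assumption, as the walk-through does not say how the edges are handled.

```python
def next_row(row, rule, states):
    """Compute the next row using the cell above and the cell to its right.

    `row` is a list of ints and `rule` is the rule string. Wrapping around
    at the right edge is an assumption made for this sketch.
    """
    width = len(row)
    new_row = []
    for i in range(width):
        above = row[i]
        right = row[(i + 1) % width]
        index = above * states + right  # two digits read as a base-`states` number
        new_row.append(int(rule[index]))
    return new_row


row = [int(c) for c in "00000100000"]
result = "".join(str(c) for c in next_row(row, "0110", 2))
print(result)  # prints 00001100000, matching the walk-through above
```

The same function works for more states because the two neighbor values are simply read as a 2 digit number in base `states`.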

If you continue this process, you get the following result;

Brick wall neighborhood

This CA also uses the brick wall neighborhood idea for selecting neighborhoods. See here for an animation of brick wall. Basically you shift the neighbor locations being checked every other line.

For the Combinations Cellular Automata in this post this means the location of the 2 neighbor cells changes every line.

For even lines you use the cell directly above and the cell to the right of that one. For example, in this next diagram the ? cell would use the 3 and 4 locations.


12345
  ?

For odd lines you use the cell to the left and the cell above. This would be the 2 and 3 neighbor cells.
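A rough Python sketch of the alternating neighborhood follows. The function name is made up, the edges wrap around, and the first generated line is treated as an "even" line so that it matches the worked example earlier in the post. All three of these are assumptions rather than details confirmed by the text.

```python
def run_ca(first_row, rule, states, steps):
    """Run a Combinations CA with the brick wall neighborhood.

    Assumptions: edges wrap around, and the first generated line uses the
    "above and right" pair (as in the worked 0110 example).
    """
    rows = [list(first_row)]
    width = len(first_row)
    for step in range(steps):
        prev = rows[-1]
        offset = 1 if step % 2 == 0 else 0  # shift the pair on every other line
        new_row = []
        for i in range(width):
            first = prev[(i - 1 + offset) % width]
            second = prev[(i + offset) % width]
            new_row.append(int(rule[first * states + second]))
        rows.append(new_row)
    return rows


start = [int(c) for c in "00000100000"]
rows = run_ca(start, "0110", 2, 3)
for r in rows:
    print("".join(str(c) for c in r))
```

Without the alternating shift the pattern would drift steadily to one side. With it, the growth stays centered under the starting cell.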

Combinations Cellular Automaton

Interesting Rules

Rule 002200121 behaves similarly to Wolfram’s Rule 30. During bulk runs of rules I have seen other results that also look like this one.

Combinations Cellular Automaton

Another result that has been observed is the “counting triangles” type of rule. These show similar behavior to Wolfram’s “Rule 225”. The splitting, bifurcation-like structure on the right hand side is like a binary counter. This is rule 010221200.

Combinations Cellular Automaton

Rule 200122011 also gives a binary counter structure.

Combinations Cellular Automaton

More example images

See my flickr gallery for more example results.

Combinations Cellular Automaton

Availability

Combinations Cellular Automata are now included in Visions of Chaos.

Jason.


Visions of Chaos movie tutorials

For years now I have been asked for more in-depth tutorials on Visions of Chaos. One recent comment on my YouTube channel said that a movie of me using the program might help.

I have wanted to do movie tutorials but I was never happy with my voice. After doing these first 3 parts I have nothing but the utmost respect for anyone that can do this easily. Even with a full script pre-written of what I want to say it is not easy.

I used the free OBS Studio to capture my screen and then the free version of DaVinci Resolve for editing. DaVinci has been great for chopping out all my seemingly endless umms, errs, pauses, stutters, flubbed lines, etc.

For the first 3 tutorials I used a super cheap headset with a built in microphone so the sound is very rough, but hopefully they can help people and get some feedback on other areas of the program people want tutorials on.

After buying a better microphone I recorded the following TensorFlow tutorials.

There are occasional very low end thumping noises. They were a mystery at first (sounds almost like construction noise, but there was none that I could hear at the time of recording) but turned out to be the lift in my building. Otherwise you may hear cicadas and other Australian wildlife in the background.

Another tutorial about the Physarum Simulations in Visions of Chaos.

Number 6 is a more relaxed non-scripted tutorial on Video Feedback simulation. This one was a bugger to edit.

The 7th tutorial covers how to print Mandelbrot Fractals across multiple pages to make large wall sized posters.

The 8th tutorial covers DeepDream functionality in more depth.

The 9th is all about Style Transfer.

I wouldn’t say I am getting good at this yet, but the recording and editing process is getting easier. If you have any specific requests for tutorial topics let me know.

Jason.

Automatic Color Palette Creation

Fractint MAP format palette files

Going back 30 years, Fractint was a fractal generation program for DOS based systems. For its time it was the fractal program of choice for enthusiasts.

Fractint used a simple text format for its color palettes. These *.MAP files were text files with each color’s RGB values separated by spaces each on a new line. So, for example if you wanted the first color in your palette to be blue the first line would be “0 0 255”.
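As a small illustration of the format described above, here is a Python sketch that writes and reads MAP text. The function names are made up, and ignoring anything after the third number on a line is an assumption about how tolerant a reader should be.

```python
def palette_to_map(palette):
    """Serialize a list of (r, g, b) tuples to MAP text, one color per line."""
    return "\n".join(f"{r} {g} {b}" for r, g, b in palette) + "\n"


def map_to_palette(text):
    """Parse MAP text back into a list of (r, g, b) tuples."""
    palette = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or incomplete lines
        r, g, b = (int(p) for p in parts[:3])
        palette.append((r, g, b))
    return palette


# A hypothetical 256 color palette that fades from blue to white.
palette = [(i, i, 255) for i in range(256)]
text = palette_to_map(palette)
print(text.splitlines()[0])  # prints 0 0 255
```

Because the format is plain text, a palette can also be checked or tweaked in any text editor.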

When I first started creating Visions of Chaos I adopted the format. The most common map files had 256 colors (you could have palettes with other color counts but I only use 256 color palettes).

The rest of this post covers the palette creation methods that have been included with Visions of Chaos. Although I use these methods specifically to create 256 color MAP files the principles could be applied to any number of colors for different sized palettes.

If you are just looking for a Fractint color palette collection, scroll down to the end of this post and grab the archive provided.

Smoothly blending colors

Visions of Chaos Color Palette Editor

This is probably the first and most obvious method to use. Take a small number of base colors (I allow up to 16) and blend them into a palette.

How you get the colors to blend can be;

1. User selects them from the standard color picker dialog.
2. User can use eye dropper functionality to pick them out of a photo.
3. Set them at random.
4. Use the color wheel. Allows selection of complementary colors, tetrads, and other color theory based colors.

Visions of Chaos Color Palette Editor

5. Extract colors from an image. See this previous blog post explaining how that works.

Visions of Chaos Color Palette Editor

Once you have the colors there are numerous ways you can blend them;

1. Smooth blend. Smoothly interpolate the colors.

Visions of Chaos Color Palette Editor

2. Fade out blend. Fade each of the colors to black.

Visions of Chaos Color Palette Editor

3. Fade in blend. Fade each of the colors from black.

Visions of Chaos Color Palette Editor

4. Neon blend. Fade from black to the colors then back to black.

Visions of Chaos Color Palette Editor

5. Stripe blend. Alternate each color for the duration of the palette.

Visions of Chaos Color Palette Editor
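Two of the blends listed above can be sketched in Python as follows. This is a guess at the general idea using plain linear interpolation, not the code used in Visions of Chaos, and the function names are made up.

```python
def smooth_blend(base_colors, size=256):
    """Linearly interpolate between consecutive base colors."""
    if len(base_colors) < 2:
        raise ValueError("need at least two base colors")
    segments = len(base_colors) - 1
    palette = []
    for i in range(size):
        pos = i * segments / (size - 1)    # position measured in segments
        seg = min(int(pos), segments - 1)  # which pair of colors we are between
        frac = pos - seg                   # 0.0 at the first color, 1.0 at the second
        c0, c1 = base_colors[seg], base_colors[seg + 1]
        palette.append(tuple(round(a + (b - a) * frac) for a, b in zip(c0, c1)))
    return palette


def fade_out_blend(base_colors, size=256):
    """Give each base color an equal band that fades from the color to black."""
    band = size / len(base_colors)
    palette = []
    for i in range(size):
        idx = min(int(i / band), len(base_colors) - 1)
        frac = (i - idx * band) / band     # 0.0 at the start of each band
        palette.append(tuple(round(c * (1 - frac)) for c in base_colors[idx]))
    return palette


colors = [(255, 0, 0), (255, 255, 0), (0, 0, 255)]
smooth = smooth_blend(colors)
faded = fade_out_blend(colors)
print(smooth[0], smooth[-1], faded[0])
```

The fade in and neon blends follow the same pattern with a different `frac` curve inside each band.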

Using curves to create palettes

The idea here is to use various mathematical functions to generate curves for the RGB components of the palette. The following is a list of the various methods I use so far.

Sine. Each RGB color component is its own sine wave. Randomize the wave amplitude, frequency and period.

Visions of Chaos Color Palette Editor

Multiple Sine. Add multiple sine waves together for each RGB component and then scale down to between 0 and 255.

Visions of Chaos Color Palette Editor
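A minimal Python sketch of the Multiple Sine idea follows. The parameter ranges and function names are made up for illustration. Using whole numbers of cycles is a choice made here so that the palette wraps around smoothly.

```python
import math
import random


def multi_sine_channel(rng, waves=3, size=256):
    """Sum several random sine waves and rescale the result to 0..255."""
    params = [(rng.uniform(0.2, 1.0),        # amplitude
               rng.randint(1, 4),            # whole cycles across the palette
               rng.uniform(0, 2 * math.pi))  # phase offset
              for _ in range(waves)]
    raw = [sum(a * math.sin(2 * math.pi * f * i / size + p) for a, f, p in params)
           for i in range(size)]
    lo, hi = min(raw), max(raw)
    span = (hi - lo) or 1.0                  # avoid dividing by zero
    return [round((v - lo) / span * 255) for v in raw]


def multi_sine_palette(seed=None, waves=3, size=256):
    """Build a palette with an independent summed sine curve per channel."""
    rng = random.Random(seed)
    r, g, b = (multi_sine_channel(rng, waves, size) for _ in range(3))
    return list(zip(r, g, b))


sine_palette = multi_sine_palette(seed=42)
print(sine_palette[:4])
```

Rescaling by the minimum and maximum of the summed curve means every channel uses the full 0 to 255 range.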

IQ. Idea from Inigo Quilez.

Visions of Chaos Color Palette Editor
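Assuming the idea referred to is Inigo Quilez's cosine based palette formula, color(t) = a + b * cos(2π(c * t + d)), a minimal Python sketch looks like this. The function name and the coefficient values are only examples.

```python
import math


def iq_palette(a, b, c, d, size=256):
    """Cosine palette: each channel is a + b * cos(2*pi*(c*t + d)), t in 0..1."""
    palette = []
    for i in range(size):
        t = i / size
        rgb = []
        for ch in range(3):
            v = a[ch] + b[ch] * math.cos(2 * math.pi * (c[ch] * t + d[ch]))
            rgb.append(round(min(max(v, 0.0), 1.0) * 255))  # clamp, then scale
        palette.append(tuple(rgb))
    return palette


# Example coefficients: offset, amplitude, frequency and phase per channel.
cosine_palette = iq_palette((0.5, 0.5, 0.5), (0.5, 0.5, 0.5),
                            (1.0, 1.0, 1.0), (0.0, 0.33, 0.67))
print(cosine_palette[0])
```

Keeping the `c` values as whole numbers makes the palette wrap around without a visible seam.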

Perlin. Use repeating noise loops as in this coding train video. Map the resulting noise values to each RGB channel. Using a looping noise function is best because it means the palette wraps around smoothly and using it for fractal zooms does not show a sharp break when the palette ends and restarts. I have only implemented this method over the last few days (at the time of writing this post), but so far it gives some really unique color palettes.

Visions of Chaos Color Palette Editor
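The looping idea can be sketched in Python by sampling 2D noise around a circle, so that the last palette entry sits next to the first. The noise below is a small home-made gradient noise in the style of Perlin noise, not the reference implementation. The radius, the per channel offsets and all function names are assumptions made for this example.

```python
import math
import random


def make_noise2d(seed=None):
    """Return a 2D gradient noise function (values roughly -0.71..0.71)."""
    rng = random.Random(seed)
    perm = list(range(256))
    rng.shuffle(perm)
    perm = perm + perm  # doubled so perm[a] + b never runs off the end
    grads = [(math.cos(k * math.pi / 4), math.sin(k * math.pi / 4))
             for k in range(8)]

    def fade(t):
        return t * t * t * (t * (t * 6 - 15) + 10)  # smooth 0..1 easing curve

    def corner(ix, iy, dx, dy):
        # Dot product of the pseudo-random gradient at a grid corner with
        # the offset from that corner to the sample point.
        gx, gy = grads[perm[perm[ix & 255] + (iy & 255)] & 7]
        return gx * dx + gy * dy

    def noise(x, y):
        x0, y0 = math.floor(x), math.floor(y)
        fx, fy = x - x0, y - y0
        u, v = fade(fx), fade(fy)
        n00 = corner(x0, y0, fx, fy)
        n10 = corner(x0 + 1, y0, fx - 1, fy)
        n01 = corner(x0, y0 + 1, fx, fy - 1)
        n11 = corner(x0 + 1, y0 + 1, fx - 1, fy - 1)
        top = n00 + u * (n10 - n00)
        bottom = n01 + u * (n11 - n01)
        return top + v * (bottom - top)

    return noise


def looping_noise_palette(seed=None, radius=1.5, size=256):
    """Walk a circle through the noise field so index size-1 meets index 0."""
    noise = make_noise2d(seed)
    palette = []
    for i in range(size):
        angle = 2 * math.pi * i / size
        rgb = []
        for ch in range(3):
            # Each channel walks its own circle so the three curves differ.
            x = 10.0 * (ch + 1) + radius * math.cos(angle)
            y = 10.0 * (ch + 1) + radius * math.sin(angle)
            n = noise(x, y) / 0.7072  # scale to roughly -1..1
            rgb.append(round(min(max(n * 0.5 + 0.5, 0.0), 1.0) * 255))
        palette.append(tuple(rgb))
    return palette


noise_palette = looping_noise_palette(seed=1)
print(noise_palette[0], noise_palette[-1])
```

A larger radius covers more of the noise field and gives a busier palette, while a smaller radius gives gentler gradients.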

Here are some example palettes created using Perlin noise. Click to see the full sized image.

Visions of Chaos Color Palette Editor

Simplex. Same as Perlin, but uses Simplex noise.

Visions of Chaos Color Palette Editor

Simplex + Perlin. Create each RGB value by adding Simplex noise to Perlin noise.

Visions of Chaos Color Palette Editor

Here are some examples of Simplex and Simplex + Perlin palettes. Click for full size.

Visions of Chaos Color Palette Editor

Multiple Perlin – Add/subtract multiple Perlin Noise curves into RGB amounts.

Visions of Chaos Color Palette Editor

Random Walk. Random curve for each RGB component between index 0 and 127. Reverse for the rest of the palette. Each step the RGB is changed by +random(5)-2 to randomly go up and/or down.

Visions of Chaos Color Palette Editor
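The Random Walk method translates almost directly into Python. In this sketch the function name, the random starting value and the clamping to the 0 to 255 range are assumptions, while the step size matches the +random(5)-2 described above.

```python
import random


def random_walk_palette(seed=None, size=256):
    """Random walk per channel for the first half, mirrored for the second."""
    rng = random.Random(seed)
    half = size // 2
    channels = []
    for _ in range(3):
        value = rng.randint(0, 255)          # assumed random starting value
        curve = []
        for _step in range(half):
            curve.append(value)
            value += rng.randint(-2, 2)      # same range as random(5)-2
            value = min(max(value, 0), 255)  # clamping is an assumption
        channels.append(curve + curve[::-1])  # mirror for indexes 128..255
    return list(zip(*channels))


walk_palette = random_walk_palette(seed=7)
print(walk_palette[0], walk_palette[255])
```

Mirroring the first half means the first and last colors are identical, so the palette repeats without a sharp break.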

Terrain Fault. Take 2 random points between 0 and 255. Between the points randomly raise or lower by a small amount. Repeat this a number of times.

Visions of Chaos Color Palette Editor

HSL to RGB. Random HSL curves converted to RGB.

Visions of Chaos Color Palette Editor

RGB. Random curves for each RGB component. Use various easing functions to tween curve control points.

Visions of Chaos Color Palette Editor

YUV to RGB. Random YUV curves converted to RGB.

Visions of Chaos Color Palette Editor

Combine palettes. Take 2 previously created palettes and combine their RGB components by addition, subtraction or multiplication.

Visions of Chaos Color Palette Editor

Multiple RGB. Combine multiple RGB curves.

Visions of Chaos Color Palette Editor

Multiple YUV to RGB. Combine multiple YUV to RGB curves.

Visions of Chaos Color Palette Editor

Modify an existing palette

Once you have palette files, you can also use various techniques to modify them;

1. Increase or decrease the individual RGB channel amounts
2. Brightness
3. Contrast
4. Increase or decrease the individual YUV channel amounts
5. Wrap. Take the existing palette, halve it, then add the flipped half to itself. This is useful when you want a non repeating palette to wrap around.

Visions of Chaos Color Palette Editor

Visions of Chaos Color Palette Editor

6. Double. If you have a palette that is too smooth/sparse for the current fractal image, doubling can add more lines/gradients to the palette.

Visions of Chaos Color Palette Editor

Visions of Chaos Color Palette Editor

7. Blur. Just like a blur function in image processing. Averages out the palette values with neighbor colors.
8. Sharpen. Just like a sharpen function in image processing.
9. Shift RGB. R->G,G->B,B->R.

Visions of Chaos Color Palette Editor

Visions of Chaos Color Palette Editor

Visions of Chaos Color Palette Editor

10. Invert. R=255-R, G=255-G, B=255-B.
11. Reverse. Flip the order of the palette colors.
12. Histogram equalize palette. Like the auto-levels in Photoshop. My method tends to make the results slightly too bright. Needs fixing when I get a chance.

Visions of Chaos Color Palette Editor

Visions of Chaos Color Palette Editor

13. Matrix multiplication. Take a 3×3 matrix and multiply the 1×3 RGB components by the matrix to get new RGB amounts.

Visions of Chaos Color Palette Editor
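Several of the modifications listed above are only a line or two each. The following Python sketch shows one possible reading of them. Taking every second color for the "halve" step of Wrap and Double is an assumption, as are the function names.

```python
def wrap(palette):
    """Keep every second color, then append the mirror image."""
    half = palette[::2]
    return half + half[::-1]


def double(palette):
    """Squeeze the palette to half length and play it twice."""
    half = palette[::2]
    return half + half


def shift_rgb(palette):
    """R->G, G->B, B->R: each value moves one channel along."""
    return [(b, r, g) for r, g, b in palette]


def invert(palette):
    return [(255 - r, 255 - g, 255 - b) for r, g, b in palette]


def reverse(palette):
    return palette[::-1]


def matrix_multiply(palette, m):
    """Treat each color as a 1x3 row vector and multiply it by the 3x3 matrix m."""
    result = []
    for color in palette:
        new = [sum(color[k] * m[k][col] for k in range(3)) for col in range(3)]
        result.append(tuple(min(max(round(v), 0), 255) for v in new))
    return result


# A hypothetical test palette fading from green to red.
base = [(i, 255 - i, 0) for i in range(256)]
wrapped = wrap(base)
print(wrapped[0], wrapped[-1])
```

A matrix that only moves values between channels reproduces Shift RGB, and the identity matrix leaves the palette unchanged.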

Any other ideas?

If you know of any other ways to generate palettes, or have an idea for ways to create new unique color palettes, let me know.

Availability

The color palette editor shown in this post is included with Visions of Chaos.

Just give me the palettes!

If you are using another program that uses Fractint palette files you can download the 3371 color palettes I include with Visions of Chaos here. Some created by me, others found on various Internet sites over the years, some converted from gradient packs. No copyright on them so do with them as you wish.

If you do have any other sets of MAP palettes you would like to share, send me an email. You can never have enough colors when creating fractal images.

Jason.

Visions of Chaos now supports Pixar RenderMan

Visions of Chaos can now use Pixar‘s Free Non-Commercial RenderMan to render 3D scenes.

This is the same RenderMan engine used in movies like Finding Dory

Finding Dory Screenshot

and Rogue One

Rogue One Screenshot

I am only using a tiny fraction of RenderMan’s features. Rendering millions of spheres or cubes with nice shading and lighting covers my requirements, but it is good to know that I have the extra power of RenderMan to expand if needed in the future.

All the end user needs to do is download the stand-alone Pixar RenderMan and point Visions of Chaos to the main command line RenderMan prman.exe renderer file. After that Visions of Chaos constructs the RIB format files RenderMan understands and gets prman to render the images.
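To give a feel for what a RIB file looks like, here is a small Python sketch that writes a list of spheres using only basic RenderMan Interface requests (Display, Format, Projection, WorldBegin, AttributeBegin, Translate, Sphere). The RIB that Visions of Chaos actually writes is not shown in this post and will include shading and lighting that this sketch leaves out. The function name and default values are made up for illustration.

```python
def spheres_to_rib(spheres, image_name="out.tif", width=640, height=480):
    """Build minimal RIB text for a list of (x, y, z, radius) spheres."""
    lines = [
        f'Display "{image_name}" "file" "rgb"',
        f"Format {width} {height} 1",
        'Projection "perspective" "fov" [40]',
        "WorldBegin",
    ]
    for x, y, z, radius in spheres:
        lines += [
            "  AttributeBegin",
            f"    Translate {x} {y} {z}",
            # Sphere takes radius, zmin, zmax and the sweep angle in degrees.
            f"    Sphere {radius} {-radius} {radius} 360",
            "  AttributeEnd",
        ]
    lines.append("WorldEnd")
    return "\n".join(lines) + "\n"


rib_text = spheres_to_rib([(0, 0, 5, 1), (2, 0, 6, 0.5)])
print(rib_text)
```

Writing one short block of text per object is also why scene files grow so quickly when there are millions of cubes or spheres.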

Currently RenderMan support is added to the following Visions of Chaos modes;

3D Ant Automata

3D Ant Automaton

3D Cyclic Cellular Automata

3D Cyclic Cellular Automaton

3D Cellular Automata

3D Cellular Automaton

3D Hodgepodge Machine

3D Hodgepodge Machine

3D Hodgepodge Machine

3D Voxel Automata Terrain

Voxel Automata Terrain

3D Diffusion-Limited Aggregation

3D Diffusion-Limited Aggregation

3D Cube Divider

Cube Divider

RenderMan seems to be able to handle a massive number of objects so far. This next test 4K image was over 33 million little cubes. RenderMan churned away for over 2 hours on this one image. The main slowdown was my lack of memory and constant hard drive paging. 32 GB of RAM just does not cut it when rendering this sort of data. The RIB scene file alone was 13 GB.

Voxel Automata Terrain

RenderMan is slower than Mitsuba at this stage. But that could be down to any number of reasons including my lack of RenderMan knowledge.

Jason.

Eroding Fractal Terrains with Virtual Raindrops

A long while back I added a very simplistic fractal terrain simulator to Visions of Chaos. I had an idea to try and add erosion simulation into the existing code to get some more realistic terrain shapes.

Generating the initial terrain

There are many ways to generate a terrain height array. For the terrain in this post I am using Perlin noise.

This is the 2D Perlin Noise image…

Fractal Terrain

…that is extruded to the following 3D terrain…

Fractal Terrain

An alternative method is to use 1/f Perlin Noise that creates this type of heightmap…

Fractal Terrain

…and this 3D terrain.

Fractal Terrain

Simulating erosion

Rather than try and replicate some of the much more complex simulators out there for wind and rain erosion (see for example here, here and here) I experimented with the simplest version I could come up with.

1. Take a single virtual rain drop and drop it to a random location on the terrain grid. Keep track of a totalsoil amount which starts at 0 when the drop is first dropped onto the terrain.
2. Look at its immediate 8 neighbors and find the lowest neighbor.
3. If no neighbors are lower, deposit the remaining total soil carried and stop. This led to large spikes because the totalsoil amount was too large, so I have since changed the amount dropped to be the same as the fixed depositrate. Technically this removes soil from the system, but the resulting terrain looks more realistic.
4. Pick up a bit of the soil from the current spot (lower the terrain array at this point).

soilamount:=slope*erosionrate;
totalsoil:=totalsoil+soilamount;
heightarray[wx,wy]:=max(heightarray[wx,wy]-soilamount,0);

5. Move to the lowest neighbor point.
6. Deposit a bit of the carried soil at this location.

deposit:=soilamount*depositrate/slope;
heightarray[lx,ly]:=heightarray[lx,ly]+deposit;
totalsoil:=max(totalsoil-deposit,0);

7. Goto 1.

Repeat this for millions of drops.

The erosion and deposit steps (4 and 6 above) simulate the water flowing down hill, picking up and depositing soil as it goes.

To add some wind erosion you can smooth the height array every few thousand rain drops. I use a simple convolution blur every 5000 rain drops. This smooths the terrain out a little bit and can be thought of as wind smoothing the jagged edges of the terrain down a bit.
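The steps above can be sketched in Python roughly as follows. Several details are assumptions rather than the code used in Visions of Chaos: a drop keeps flowing (returning to step 2) until it has no lower neighbor, a step cap stops a drop from bouncing between two nearly level cells forever, the amount dropped in a pit is capped at erosion_rate * deposit_rate, grid edges are simply treated as walls, and the wind pass is a plain 3×3 box blur.

```python
import random

NEIGHBORS = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if dx or dy]


def flow_drop(height, x, y, erosion_rate=0.1, deposit_rate=0.1, max_steps=64):
    """Let one raindrop run downhill from (x, y), eroding and depositing soil."""
    rows, cols = len(height), len(height[0])
    total_soil = 0.0
    for _ in range(max_steps):  # the step cap is an assumption
        # Step 2: find the lowest of the 8 neighbors.
        lx, ly, lowest = x, y, height[y][x]
        for dx, dy in NEIGHBORS:
            nx, ny = x + dx, y + dy
            if 0 <= nx < cols and 0 <= ny < rows and height[ny][nx] < lowest:
                lx, ly, lowest = nx, ny, height[ny][nx]
        if (lx, ly) == (x, y):
            # Step 3: nowhere lower to go, drop a small capped amount and stop.
            height[y][x] += min(total_soil, erosion_rate * deposit_rate)
            return
        # Step 4: pick up soil in proportion to the slope.
        slope = height[y][x] - lowest
        soil = slope * erosion_rate
        total_soil += soil
        height[y][x] = max(height[y][x] - soil, 0.0)
        # Steps 5 and 6: move downhill and deposit some of the load.
        deposit = soil * deposit_rate / slope
        height[ly][lx] += deposit
        total_soil = max(total_soil - deposit, 0.0)
        x, y = lx, ly


def blur(height):
    """3x3 box blur standing in for wind erosion."""
    rows, cols = len(height), len(height[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            cells = [height[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if 0 <= y + dy < rows and 0 <= x + dx < cols]
            out[y][x] = sum(cells) / len(cells)
    return out


def erode(height, drops, blur_every=5000, seed=None):
    """Drop `drops` raindrops at random spots, blurring every `blur_every` drops."""
    rng = random.Random(seed)
    rows, cols = len(height), len(height[0])
    for n in range(1, drops + 1):
        flow_drop(height, rng.randrange(cols), rng.randrange(rows))
        if n % blur_every == 0:
            height = blur(height)
    return height


# A tiny hypothetical terrain: a single peak on a flat plain.
peak = [[0.0] * 3 for _ in range(3)]
peak[1][1] = 10.0
flow_drop(peak, 1, 1)
print(peak[1][1])
```

Because the deposit works out to the constant erosion_rate * deposit_rate on every move, steep slopes lose far more soil than they gain, which is what carves the gullies.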

Erosion Movie

Here is a sample movie of the erosion process. This is a 13,500 frame 4K resolution movie. Each frame covers 10,000 virtual raindrops being dropped. This took days to render. 99% of the time is rendering the mesh with OpenGL. Simulating the raindrops is very fast.

and here are screenshots of every 2000th frame to show the erosion details more clearly.

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Fractal Terrain

Future ideas

The above is really just a quick experiment and I would like to spend more time on the more serious simulations of terrain generation and erosion. I have the book Texturing and Modeling, A Procedural Approach on my bookshelf and it contains many other terrain ideas.

Jason.