Generative AI: Heaven and Hell

A demonic battle in the style of Hieronymus Bosch.

Generative AI is going to change everything.

After some initial fiddling to get the hang of the slightly irritating interface via Discord, I’ve started to dig into the capabilities of the generative AI Midjourney. And my mind is reeling.

This technology is revolutionary. It’s the biggest leap forward since the Internet itself.

While I’ve been following recent developments in AI with keen interest, some initial experiments with Dall-E 2 left me underwhelmed. It was still impressive—it’s an AI creating images from scratch, for God’s sake—but the specific results often seemed far from what I imagined, and Dall-E seemed to balk at creating the kind of gritty and gruesome images I conceived in relation to my own fiction.

Not so Midjourney. However the Midjourney team is training their black box, it’s highly attuned not only to sci-fi and fantasy images, but to a host of often unexpected uses.

But let’s start with sci-fi and fantasy.

Initially I was trying for very specific images, with mixed results. Like Dall-E, the AI often took my prompts and delivered something that was close but not close enough. Consider the prompt “a horse with black alligator skin and a raptor head, full body”: seems simple enough, right?

Close, but no cigar. Or half a cigar, and half a bird sticking out of your butt. Don’t look a raptor horse in the mouth, I guess.

This kind of thing happens a lot, and no doubt there’s a lot you can do to fix it, but I’m just starting out. Similar experiments ran into similar problems. For a while I kept trying to get it to create “a black horse with shiny snake scales,” which seems direct and intuitive enough; but the system kept creating horses with half scales and half mane, or scales that were far too tiny for my purposes (a reference for a hand-drawn piece I’m working on). On the other hand, a slight change in the prompt, to “a horse with black alligator skin, full body,” delivered some perfectly usable images.

Maybe not as natural as I’d like, more plastic than animal, but as in all these images, I’m immensely impressed by the AI’s ability to create textures and model light around volumes—some of the great challenges of drawing.

But this is the tip of the iceberg. Let’s look at “a beautiful and intricate wooden sculpture of kannon bodhisattva, with a beautiful face and flowing robes, against a background of carved wooden leaves and branches, in the style of baroque european altarpieces, in the style of master h.l.”

With a few phrases, the AI conjures works of real beauty, a Baroque Buddhist art tradition that never was, or has yet to be. No, they’re not perfect; but consider the astonishingly graceful way Midjourney renders drapery, the folds cascading, wrapping, flowing, circling around the limbs. As an artist, I know that modeling drapery this way requires enormous skill. The old masters studied for decades to accurately render the way cloth hangs on the body. And now, with a few words, anyone can do it.

Or, more to the point for my own work, I can take these images as templates for compositions. I can keep the sinuous folds, but correct the misshapen hands; I can note the way the light falls on the carved branches, but alter the paths of the branches themselves; copy the texture of the leaves, but adapt the shape to become fig leaves, oak leaves, cherry blossoms.

Midjourney, I’ve concluded, is incredibly powerful for anyone. It democratizes the power of visualization, makes Baroque masters of us all. And for those who already possess artistic skill, for those who understand composition, light and color, it amplifies their abilities further still.

Consider also: What’s to stop me from using another program to render the images as 3D models—and then simply print them out on a 3D printer? What’s to stop me from rendering these figures in the real world, as real altarpieces?

Nothing, of course, though I may be underestimating the obstacles. But for a more approachable product, consider these sort-of wooden Dharma wheels, again in a Baroque style:

Or these ivory pendants:

How hard would it be to manufacture these, one way or another—by 3D printing, or by rendering them as vectors and then cutting the shapes out of wood with a laser cutter? Fairly trivial; and more trivial still with an AI generator that creates 3D models directly.

Not just pixels on a screen: real altarpieces, real pendants. Real buildings, cars, computer code, electronic components, music, movies, chemicals, viruses: real everything, jetting from our AI daemons like water from a fire hose, faster than we can possibly absorb.

Speaking of demons, here’s the prompt that really blew my mind: “A demonic battle in the style of Hieronymus Bosch, with science fiction elements.”

These images are stunning. They’re as good (well, nearly) as anything the top fantasy illustrators in the world can produce (perhaps not entirely surprising, since as we all know, the AI was trained on their works). More: they’re wildly creative, wildly evocative. They tell stories, they introduce characters, settings, conflicts. There are obvious similarities between them in terms of style, but the specific images are astonishingly unique. A rhino sorcerer conjures flames in his hand, a vampire lord floats over a battlefield of struggling minotaurs, a spiked ball wanders with its fellows on a strange dusty planet. The AI dips into the ocean of human art, and summons monsters from the abyss.

Just metaphor? Pixels on a screen?

Don’t get me wrong: I’m excited. These tools are immensely well suited to my own work—to illustrate stories I’ve already written, to inspire new stories, to incorporate into drawings and designs. A flood of projects is about to spill the banks.

But in the wider world, I can’t imagine that we’re ready for this pace of change. What rough beast indeed?