The rapid progress and quality of generative AI models for 2D images have prompted widespread curiosity about how such methods could be applied to 3D. In Prompt Planets, I look at the potential of generative AI for 3D content generation, but within the constraints of 2D image generation via Stable Diffusion. Rather than exploring another learning-based method that produces 3D models directly from a prompt, I constrain the project to what can be derived mathematically from images at high quality. As a result, I consider planets: spherical models that can be built from terrain images generated by Stable Diffusion.
Stable Diffusion 1.5 generated terrains via variations on the prompt '(2d satellite image)...'
With the motivation to produce 3D content with Stable Diffusion, I consider simple ways we might already be able to extract 3D geometry from a 2D image. A heightmap is one such method, translating grayscale images into 3D terrain by turning pixel values into vertex heights on a flat mesh plane. In this way, images of terrain generated by Stable Diffusion can be converted directly into 3D.
Sample input image at desired XY vertex resolution of the plane and displace vertices based on RGB pixel values
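The heightmap idea can be illustrated with a minimal sketch. It is not the Grasshopper definition used for the project: the function name `heightmap_to_plane`, the `size` and `max_height` parameters, the nearest-pixel sampling, and the simple channel average used for grayscale are all assumptions made for illustration. The image is represented as a nested list of RGB tuples so the example has no dependencies.

```python
def heightmap_to_plane(image, res_x, res_y, size=1.0, max_height=0.1):
    """Build a displaced planar quad mesh from an RGB image.

    image: rows of (r, g, b) tuples with channel values in 0..255.
    res_x, res_y: number of vertices along X and Y (hypothetical parameters).
    Returns (vertices, faces); faces are quads of vertex indices.
    """
    if res_x < 2 or res_y < 2:
        raise ValueError("need at least 2 vertices per axis")
    rows, cols = len(image), len(image[0])

    vertices = []
    for j in range(res_y):
        for i in range(res_x):
            u = i / (res_x - 1)  # 0..1 across the plane
            v = j / (res_y - 1)
            # Nearest-pixel sampling (assumption; a real pipeline may interpolate).
            px = min(int(u * (cols - 1) + 0.5), cols - 1)
            py = min(int(v * (rows - 1) + 0.5), rows - 1)
            r, g, b = image[py][px]
            gray = (r + g + b) / (3 * 255)  # simple average, 0..1
            vertices.append((u * size, v * size, gray * max_height))

    faces = []
    for j in range(res_y - 1):
        for i in range(res_x - 1):
            a = j * res_x + i
            faces.append((a, a + 1, a + res_x + 1, a + res_x))
    return vertices, faces


# Usage: a 2x2 checker image sampled at a 2x2 vertex resolution.
checker = [[(0, 0, 0), (255, 255, 255)],
           [(255, 255, 255), (0, 0, 0)]]
plane_vertices, plane_faces = heightmap_to_plane(checker, 2, 2)
```

Brighter pixels become taller vertices, and `max_height` controls how exaggerated the terrain looks relative to the size of the plane.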
I wanted to extend this further and consider how to make a fully enclosed 3D object. I recalled the quad sphere, a mesh sphere composed of six sides, just like a cube. Each side is a planar mesh warped so that the six together form an enclosed sphere. This means that the vertices of a flat planar mesh can be projected onto one side of the quad sphere, and hence that we can map one generated heightmap to each side. Though a particular use case, this general concept can help us start imagining a 3D design pipeline with Stable Diffusion.
A typical cube and the quad sphere have the same organization and number of mesh faces
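One common way to build a quad sphere is to lay a vertex grid on each face of a cube and normalize every vertex onto the sphere. The sketch below assumes that construction; Grasshopper's built-in quad sphere may distribute its vertices differently, and the `CUBE_FACES` layout and function names are hypothetical.

```python
import math

# Each cube face as (normal, u_axis, v_axis) on the cube [-1, 1]^3.
# The axis choice per face is an assumption made for this sketch.
CUBE_FACES = [
    ((1, 0, 0), (0, 1, 0), (0, 0, 1)),
    ((-1, 0, 0), (0, 0, 1), (0, 1, 0)),
    ((0, 1, 0), (0, 0, 1), (1, 0, 0)),
    ((0, -1, 0), (1, 0, 0), (0, 0, 1)),
    ((0, 0, 1), (1, 0, 0), (0, 1, 0)),
    ((0, 0, -1), (0, 1, 0), (1, 0, 0)),
]


def quad_sphere_face(normal, u_axis, v_axis, res, radius=1.0):
    """Return res*res vertices of one cube face projected onto the sphere."""
    verts = []
    for j in range(res):
        for i in range(res):
            s = 2 * i / (res - 1) - 1  # -1..1 across the face
            t = 2 * j / (res - 1) - 1
            # Point on the flat cube face.
            p = [normal[k] + s * u_axis[k] + t * v_axis[k] for k in range(3)]
            # Push the point out (or in) to the sphere surface.
            length = math.sqrt(sum(c * c for c in p))
            verts.append(tuple(radius * c / length for c in p))
    return verts


def build_quad_sphere(res, radius=1.0):
    """Return six vertex grids, one per cube face."""
    return [quad_sphere_face(n, u, v, res, radius) for n, u, v in CUBE_FACES]


sphere_faces = build_quad_sphere(res=9)
```

Because each face keeps the row-and-column layout of a flat grid, a heightmap sampled at the same resolution lines up with it vertex for vertex.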
A common way to create the backdrop for 3D environments is with cubemaps: the six squares of an unfolded cube that hold a continuous texture, seamless at each edge. Because we are working with a six-sided quad sphere, what we want to generate with the help of generative AI and prompts are cubemaps. From these cubemaps we can create a fully 3D, textured, and seamless representation of the generated images.
The cubemap is an unwrapped cube where each of its six panels is seamless to its adjacent panel
The cubemap rewrapped around the cube still has continuous seams along all of the face edges
Instead of using Stable Diffusion to create all six sides, I decided to generate one or two images per planet and use the texture of those images to create the rest of the cubemap. Here, I employ a more manual process, using Photoshop and Content-Aware Fill to fill out the intermediary squares between the ones given by Stable Diffusion. The reason for going this route is to maintain a consistent textural design for the whole planet and, more importantly, to create seamless transitions between panels. I also use Content-Aware Fill again along the seams between panels to make sure they look as seamless as possible.
With just two input images in the cubemap, we can fill in the rest of the panels using the textures of the given images
There are a number of ways to modify the cubemaps even further, both with Photoshop and with inpainting via Stable Diffusion. These include inpainting new geographic features on an otherwise textureless surface, or masking out the border of an image to turn a continuous terrain into an island. In this way, 2D image-generation techniques also let us edit the content before it becomes 3D.
Interior masked inpainting adding the Stable Diffusion prompt: 2d satellite image of a (white arctic iceberg)
Border masked inpainting adding the Stable Diffusion prompt: (2d satellite image) of an island in the ocean
To convert the cubemaps into 3D spherical planets, I created an automated 2D-to-3D pipeline within my 3D modeler, Rhino, and its scripting interface, Grasshopper. Out of the box, Grasshopper comes with geometry primitives and high-level mesh functions that are helpful for building out the cubemap processing script. These include a base mesh quad sphere, onto which the heightmaps are projected.
Given a planar heightmap, the same transformation can be applied on a face of the quad sphere
The script for processing the cubemaps works by individually processing each of the six square panels. In summary, the script proceeds in the following order:
Given a resolution parameter, sample a 2D array of pixels in the input image
Calculate the weighted luminosity of the image to convert RGB to grayscale
Pair the vertices of each of the six quad sphere sides with the corresponding heightmap
Displace each quad sphere vertex along the vector between it and the center point of the sphere, by the magnitude of its corresponding grayscale value from the heightmap
Gather overlapping vertices along the seams of the mesh quad sphere and average them to a common point to close the seam
Using a NURBS-based quad sphere, apply its UV mapping to the newly constructed mesh quad sphere (important for applying the image as a texture back to the mesh for the final 3D model)
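The grayscale conversion and displacement steps listed above can be sketched as follows. This is an illustration rather than the Grasshopper script itself: the Rec. 601 luma weights, the outward displacement direction (brighter means higher), and the `scale` parameter are assumptions, since the write-up does not specify them.

```python
import math


def luminance(rgb):
    """Weighted luminosity of an (r, g, b) pixel, normalized to 0..1.

    Uses the Rec. 601 weights (an assumption; other weightings exist).
    """
    r, g, b = rgb
    return (0.299 * r + 0.587 * g + 0.114 * b) / 255.0


def displace_radially(vertices, grays, center=(0.0, 0.0, 0.0), scale=0.1):
    """Move each vertex away from the sphere center by scale * gray.

    vertices and grays must be paired one to one, as in the pairing step.
    """
    displaced = []
    for v, gray in zip(vertices, grays):
        # Vector from the center of the sphere to the vertex.
        d = [v[k] - center[k] for k in range(3)]
        length = math.sqrt(sum(c * c for c in d))
        offset = scale * gray
        displaced.append(tuple(v[k] + d[k] / length * offset for k in range(3)))
    return displaced


# Usage: two vertices on a unit sphere, paired with a white and a black pixel.
pixels = [(255, 255, 255), (0, 0, 0)]
grays = [luminance(p) for p in pixels]
moved = displace_radially([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], grays)
```

Keeping `scale` small relative to the sphere radius gives subtle relief, while larger values exaggerate mountains and trenches.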
Final output after applying all faces of the cubemap onto the quad sphere faces
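The seam-closing step can also be sketched in a few lines. The idea assumed here is that vertices from neighboring faces coincide on the undisplaced sphere, so they can be grouped by their base position and their displaced positions averaged. The function name `weld_seams` and the tolerance `tol` are hypothetical, and the grid-based grouping is a simplification that can miss near-coincident points lying on opposite sides of a grid cell boundary.

```python
def weld_seams(base_vertices, displaced_vertices, tol=1e-6):
    """Average the displaced positions of vertices that overlap on the base sphere.

    base_vertices: positions before displacement (used to find overlaps).
    displaced_vertices: positions after displacement, in the same order.
    Returns a new list where every overlapping group shares one position.
    """
    # Group vertex indices by their quantized base position.
    groups = {}
    for index, v in enumerate(base_vertices):
        key = tuple(round(c / tol) for c in v)
        groups.setdefault(key, []).append(index)

    welded = list(displaced_vertices)
    for indices in groups.values():
        if len(indices) < 2:
            continue  # interior vertex, nothing to close
        n = len(indices)
        average = tuple(
            sum(displaced_vertices[i][k] for i in indices) / n for k in range(3)
        )
        for i in indices:
            welded[i] = average
    return welded


# Usage: the first two vertices share a seam but were displaced differently.
base = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = [(1.2, 0.0, 0.0), (1.4, 0.0, 0.0), (0.0, 1.1, 0.0)]
closed = weld_seams(base, moved)
```

Corner vertices, where three faces meet, fall into the same group and are averaged together in the same pass.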
The script can be published to a panel within the 3D modeling editor. This simplifies the 3D generation process to tuning parameters for the final 3D model. Once the model is established, the planet mesh can be exported and later brought into its new solar system!
Script parameters brought into Rhino's modeling UI for ease of generating new planets
Having a way to generate planets, I wanted to create a proper context and interface to explore my generative creations. My plan was to make a full solar system with nine unique planets and a sun at its center. Using this pipeline, I made all ten bodies, exploring different color schemes and terrain types in Stable Diffusion to create a widely varied solar system.
Ten cubemaps crafted with Stable Diffusion's help, to be turned into a 3D solar system
3D models created via cubemap inputs
Within Unity, I developed a basic 3D solar system environment and UI, deterministically setting the orbit speeds, planet rotations, sizes, and order from the sun. In the final build, the solar system can be traversed as if in a 3D editor, and each planet can be selected to reveal its Stable Diffusion prompt, seed, and 2D cubemap. Perhaps more importantly, the generated high-fidelity 3D models and textures, which I could not easily have achieved on my own, freed me to make my own intuitive design decisions with the 3D assets. Thanks, AI!
Scenes from within the Prompt Planets web app
As someone looking in from the outside at the ongoing developments in generative AI, I think the tools and skills that 3D designers and developers already have can begin translating 2D into 3D in creative ways with high-quality output. Furthermore, I think this will push the conversation about how we use generative models in new and interesting directions. Even learning which prompts are necessary or well optimized for a specific design pipeline uncovers unique use cases for the generative AI model. We very much need to iterate on interesting applications for these models as well.
For future work specific to Prompt Planets, I like the possibility of creating whole worlds from prompt input. I think a lot about games like Outer Wilds or No Man’s Sky that celebrate planetary exploration - why not make it infinite via prompts? Of course, that is a long way off from what I have developed, but exercising the imagination is important for suggesting new avenues for research. I think there is an attitude of generality in AI right now that could benefit from building more specific things that require more domain-specific processes - and that does not just mean augmenting current datasets or training. There is a lot to learn from combining these models with the knowledge that 3D design has already built up.
Explore the web build of Prompt Planets yourself here!