The Yoga of Image Generation – Part 3

Image Prompt Adapters (IP Adapters)

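Let’s now try another technique, one that is more decoupled from the base model: Image Prompt Adapters (IP Adapters). An IP Adapter plays a role similar to a ControlNet, steering generation with a reference image, but in ComfyUI it is applied to the model itself rather than to the conditioning. Think of an IP Adapter as a one-image LoRA.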
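To make the "decoupled" idea more concrete, here is a minimal pure-Python sketch of decoupled cross-attention, the mechanism described in the IP-Adapter paper as I understand it: the image prompt gets its own attention branch, and its output is added to the text branch with a weight. The function names, toy vectors, and the `image_scale` parameter are illustrative assumptions, and the learned projection layers are left out. This is not the code ComfyUI runs.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention on plain Python lists."""
    dim = len(keys[0])
    output = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        output.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return output

def decoupled_cross_attention(queries, text_kv, image_kv, image_scale=1.0):
    """Text attention plus a separately computed, weighted image attention."""
    text_out = attention(queries, *text_kv)
    image_out = attention(queries, *image_kv)
    return [
        [t + image_scale * i for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(text_out, image_out)
    ]

# Toy data (assumed values): two latent queries, two text tokens, one image token.
queries = [[1.0, 0.0], [0.0, 1.0]]
text_kv = ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
image_kv = ([[0.5, 0.5]], [[10.0, 20.0]])

text_only = decoupled_cross_attention(queries, text_kv, image_kv, image_scale=0.0)
blended = decoupled_cross_attention(queries, text_kv, image_kv, image_scale=0.5)
```

With `image_scale=0.0` the result is identical to plain text attention, which is why the adapter can be dialed up or down, or removed, without touching the base model's own weights.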

The FaceID IP Adapter, a variant specialized in extracting identity features from a reference face, is a perfect fit for our needs.

FaceDetailer

While exploring facial enhancement tools, I also discovered FaceDetailer, which improves facial features (eyes, nose, lips, expression) after image generation. I decided to integrate both of these components into our workflow. FaceDetailer’s enhancements are based on the FaceID input, so they remain faithful to the original facial reference.
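I haven't described how FaceDetailer works internally. As far as I understand, detailer nodes of this kind follow a crop, refine, and paste-back pattern: find the face box, regenerate only that region, then composite it into the image. The sketch below illustrates that pattern on a tiny grayscale grid. Every name in it is hypothetical, and `brighten` is only a stand-in for the second diffusion pass, so treat it as a mental model rather than the node's actual implementation.

```python
def crop(image, box):
    """Return the sub-grid inside box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def paste(image, patch, box):
    """Return a copy of image with patch written at the box's top-left corner."""
    top, left, _, _ = box
    result = [row[:] for row in image]
    for dy, patch_row in enumerate(patch):
        for dx, value in enumerate(patch_row):
            result[top + dy][left + dx] = value
    return result

def detail_region(image, box, refine):
    """Crop the region, refine it, and composite it back into the image."""
    return paste(image, refine(crop(image, box)), box)

def brighten(patch):
    """Hypothetical stand-in for re-sampling the face crop with the model."""
    return [[min(1.0, value + 0.5) for value in row] for row in patch]

# A 4x4 "image" with an assumed face box covering the central 2x2 pixels.
image = [[0.0] * 4 for _ in range(4)]
face_box = (1, 1, 3, 3)
detailed = detail_region(image, face_box, brighten)
```

Only the pixels inside the box change, which matches the goal here: the rest of the pose and composition stays untouched while the face gets a second, more focused pass.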

Here is the complete workflow:

Final workflow

We have finally achieved our desired outcome:

Control over style via prompts and embeddings
Control over pose via ControlNets
Control over identity via the FaceID IP Adapter and FaceDetailer

This setup allows us to generate precise and coherent Yoga sequences.

Sequence with FaceID, FaceDetailer and ControlNets

Another advantage of this workflow is how easily we can switch the base model. For instance, here’s an example using the Cheyenne model, which specializes in cartoon and graphic novel styles:

Changing the Base Model

It’s also incredibly easy to change the subject’s identity. Since FaceID only requires a single image and no training phase, here are examples generated with the exact same workflow, using my own face as input for facial identity:

Changing the Persona

This concludes our three-part series. My initial goal — generating accurate yoga poses and full sequences using only a local machine — has been achieved.

In Part 1, we introduced Stable Diffusion and ComfyUI to build simple Text-to-Image workflows using prompts and embeddings. In Part 2, we explored pose transfer using Image-to-Image workflows and ControlNets. In this final installment, we addressed facial consistency, first with LoRAs, then with the FaceID IP Adapter and the post-processing FaceDetailer.

You’re now ready to create custom workflows tailored to your specific visual goals. Enjoy experimenting with generative AI to express your creativity with precision!

Stay tuned for more image generation tutorials and, in the meantime, feel free to explore my YouTube channel.
