
ai-forever/kandinsky-2.2

multilingual text2image latent diffusion model


Run time and cost

This model costs approximately 0ドル.047 per run on Replicate, or about 21 runs per 1ドル, though the exact cost varies with your inputs. It is also open source, so you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 34 seconds, though prediction time varies significantly with the inputs.
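
As an illustration, here is a minimal sketch of calling this model through Replicate's Python client. The input names (`prompt`, `width`, `height`) are assumptions based on typical text2image schemas; check the API tab for the exact parameters this model accepts.

```python
# pip install replicate   (and set REPLICATE_API_TOKEN in your environment)
import replicate

# The input keys below are assumptions; see the model's API schema for the
# exact parameters. Depending on your client version you may need to pin a
# version hash, e.g. "ai-forever/kandinsky-2.2:<version>".
output = replicate.run(
    "ai-forever/kandinsky-2.2",
    input={
        "prompt": "a red cat wearing a spacesuit, detailed digital art",
        "width": 1024,
        "height": 1024,
    },
)
print(output)  # usually a URL (or list of URLs) to the generated image(s)
```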

Readme

Kandinsky 2.2

Kandinsky 2.2 brings substantial improvements over its predecessor, Kandinsky 2.1, introducing a new, more powerful image encoder, CLIP-ViT-G, along with ControlNet support.

The switch to CLIP-ViT-G as the image encoder significantly improves the model's ability to generate aesthetically pleasing images and to understand text, enhancing its overall performance.
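
For example, here is a minimal text-to-image sketch using the Hugging Face diffusers port of Kandinsky 2.2. The `kandinsky-community/kandinsky-2-2-decoder` checkpoint name refers to the community release on the Hugging Face Hub and is an assumption about your local setup; the Playground and API tabs remain the canonical way to run the Replicate deployment.

```python
# pip install diffusers transformers accelerate torch
import torch
from diffusers import AutoPipelineForText2Image

# Checkpoint name is an assumption based on the community diffusers release.
pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16
)
pipe.to("cuda")

# The CLIP-ViT-G based prior and the decoder both run inside this pipeline.
image = pipe(
    prompt="портрет космонавта в стиле акварели",  # the page describes the model as multilingual
    negative_prompt="low quality, blurry",
    height=768,
    width=768,
).images[0]
image.save("kandinsky_output.png")
```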

The addition of the ControlNet mechanism lets users steer the image generation process with conditioning inputs. This leads to more accurate and visually appealing outputs and opens new possibilities for text-guided image manipulation.
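
For instance, here is a sketch of depth-conditioned generation with the community ControlNet checkpoints in diffusers. The checkpoint names and the depth-map preprocessing are assumptions for illustration; in practice you would estimate the depth map with a dedicated depth model.

```python
import torch
import numpy as np
from PIL import Image
from diffusers import KandinskyV22PriorPipeline, KandinskyV22ControlnetPipeline

# Checkpoint names are assumptions based on the community diffusers release.
prior = KandinskyV22PriorPipeline.from_pretrained(
    "kandinsky-community/kandinsky-2-2-prior", torch_dtype=torch.float16
).to("cuda")
pipe = KandinskyV22ControlnetPipeline.from_pretrained(
    "kandinsky-community/kandinsky-2-2-controlnet-depth", torch_dtype=torch.float16
).to("cuda")

# A depth map guides the layout of the generated image. Here we load a
# precomputed one (hypothetical file); normally you would produce it with
# a depth-estimation model.
depth = Image.open("depth_map.png").convert("L").resize((768, 768))
hint = torch.from_numpy(np.array(depth)).float() / 255.0
hint = hint.unsqueeze(0).repeat(3, 1, 1).unsqueeze(0).half().to("cuda")

# Stage 1: the prior maps the text prompt to CLIP image embeddings.
emb = prior(prompt="a robot sitting in an armchair, studio photo")
# Stage 2: the decoder turns embeddings into pixels, constrained by the hint.
image = pipe(
    image_embeds=emb.image_embeds,
    negative_image_embeds=emb.negative_image_embeds,
    hint=hint,
    height=768,
    width=768,
).images[0]
image.save("controlnet_output.png")
```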
