Kandinsky 5 10 sec (NABLA support) #12520

Open
leffff wants to merge 103 commits into huggingface:main from leffff:main

Contributor

leffff commented Oct 21, 2025

This PR adds support for the 10-second models in the Kandinsky 5.0 family.

import torch
from diffusers import Kandinsky5T2VPipeline
from diffusers.utils import export_to_video

# Load the pipeline
pipe = Kandinsky5T2VPipeline.from_pretrained(
    "ai-forever/Kandinsky-5.0-T2V-Lite-sft-10s-Diffusers",
    torch_dtype=torch.bfloat16,
)
pipe = pipe.to("cuda")

# Switch the transformer to the flex attention backend
pipe.transformer.set_attention_backend("flex")

# Generate video (a batch of two prompts, one video per prompt)
prompt = [
    "Photorealistic closeup video of two intricately detailed pirate ships locked in a fierce battle, complete with cannon fire and billowing sails, as they sail through the swirling waters of a steaming cup of coffee. The ships are miniature but highly realistic, with wooden textures and flags fluttering in the liquid breeze. Coffee splashes and foam ripple around them as they maneuver through the turbulent surface, dodging each other's attacks. A detailed reflection of the battle appears on the glossy surface of the coffee, adding to the dynamic realism. The camera pans and zooms to capture every dramatic moment of the high-seas clash within this tiny, unexpected world.",
    "Bad quality",
]
negative_prompt = "Static, 2D cartoon, cartoon, 2d animation, paintings, images, worst quality, low quality, ugly, deformed, walking backwards"
output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=512,
    width=768,
    num_frames=241,
    num_inference_steps=50,
    guidance_scale=5.0,
    num_videos_per_prompt=1,
    # torch.Generator(42) would treat 42 as a device; seed the generator explicitly instead
    generator=torch.Generator(device="cuda").manual_seed(42),
)

# Save the first video (assumes the output exposes `.frames` and 24 fps playback)
export_to_video(output.frames[0], "output.mp4", fps=24)
Attached video: output.12.mp4
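
As a quick sanity check on num_frames=241 in the example above: assuming 24 fps playback (an assumption, since the PR description does not state the frame rate) and counting the first frame at t = 0, 241 frames span 10 seconds. The helper names below are hypothetical and only illustrate the arithmetic.

```python
def clip_duration_seconds(num_frames: int, fps: int = 24) -> float:
    """Duration of a clip whose first frame sits at t = 0 (fps is assumed)."""
    return (num_frames - 1) / fps


def frames_for_duration(seconds: float, fps: int = 24) -> int:
    """Number of frames needed to cover `seconds`, including the frame at t = 0."""
    return int(round(seconds * fps)) + 1


duration = clip_duration_seconds(241)   # 240 intervals / 24 fps
needed = frames_for_duration(10)        # 10 s * 24 fps + 1
print(duration, needed)
```

Under the same assumption, a 5-second clip would need 121 frames.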

leffff and others added 30 commits October 4, 2025 10:10
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Member

@leffff let's add the tests and docs as well.

Collaborator

ok, let's just use this PR to add docs and tests?

Contributor Author

leffff commented Oct 22, 2025

Okay

Contributor Author

leffff commented Oct 23, 2025

Please check out the docs.

Collaborator

yiyixuxu left a comment

thanks!

Collaborator

@bot /style

Contributor

github-actions bot commented Oct 23, 2025

Style bot fixed some files and pushed the changes.

Contributor Author

leffff commented Oct 24, 2025

@yiyixuxu plz check the new docs version!

Collaborator

yiyixuxu left a comment

looks really good! thanks!

Member

@leffff could you also add kandinsky_v5 to _toctree.yml?

Contributor Author

leffff commented Oct 24, 2025

Okay!

Contributor Author

leffff commented Oct 24, 2025

Contributor Author

leffff commented Oct 25, 2025

Please review and merge!

Member

sayakpaul left a comment

Thanks! We should also add tests. Could you please do that too?

@stevhliu please also review the docs.

Contributor Author

leffff commented Oct 25, 2025

Okay!

Contributor Author

leffff commented Oct 27, 2025

Please check tests

self.assertEqual(output_with_embeds.shape, output_with_prompt.shape)


@slow
Member

sayakpaul commented Oct 27, 2025

We can leave this one out for now.

max_diff = np.abs(output.detach().cpu().numpy() - output_loaded.detach().cpu().numpy()).max()
self.assertLess(max_diff, 1e-4)

def test_prompt_embeds(self):
Member

sayakpaul commented Oct 27, 2025

We should be able to test it with:

def test_encode_prompt_works_in_isolation(self, extra_required_param_value_dict=None, atol=1e-4, rtol=1e-4):
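
A rough sketch of what such an isolation check verifies: embeddings computed separately by the prompt encoder should reproduce the output obtained from the raw prompt, within a tolerance. ToyPipeline and check_encode_prompt_in_isolation below are hypothetical stand-ins written for illustration, not the diffusers test mixin, whose actual implementation may differ. Only the atol and rtol defaults are taken from the signature quoted above.

```python
import numpy as np


class ToyPipeline:
    """Hypothetical stand-in for a text-to-video pipeline (not the diffusers API)."""

    def encode_prompt(self, prompt: str) -> np.ndarray:
        # Deterministic fake embedding: UTF-8 byte values scaled into [0, 1).
        codes = np.frombuffer(prompt.encode("utf-8"), dtype=np.uint8)
        return codes.astype(np.float32) / 256.0

    def __call__(self, prompt=None, prompt_embeds=None) -> np.ndarray:
        # Require exactly one of the two inputs.
        if (prompt is None) == (prompt_embeds is None):
            raise ValueError("Pass exactly one of `prompt` or `prompt_embeds`.")
        if prompt_embeds is None:
            prompt_embeds = self.encode_prompt(prompt)
        # Fake "generation": a deterministic function of the embeddings.
        return np.tanh(prompt_embeds * 3.0)


def check_encode_prompt_in_isolation(pipe, prompt, atol=1e-4, rtol=1e-4):
    """Compare the output from a raw prompt with the output from precomputed embeddings."""
    embeds = pipe.encode_prompt(prompt)
    out_prompt = pipe(prompt=prompt)
    out_embeds = pipe(prompt_embeds=embeds)
    assert out_embeds.shape == out_prompt.shape
    assert np.allclose(out_embeds, out_prompt, atol=atol, rtol=rtol)
    # Report the largest elementwise deviation between the two outputs.
    return float(np.abs(out_embeds - out_prompt).max())


max_diff = check_encode_prompt_in_isolation(ToyPipeline(), "a cat baking a cake")
print(f"max abs difference: {max_diff}")
```

The shape comparison and the maximum absolute difference mirror the assertions in the test excerpts quoted earlier in this review.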

Member

sayakpaul left a comment

Thanks. Just two minor comments.


Reviewers

yiyixuxu approved these changes

sayakpaul approved these changes
