Hello!
The HunyuanVideoI2V PR broke the HunyuanVideo example code in xDiT, causing an error at this line:
https://github.com/xdit-project/xDiT/blob/main/examples/hunyuan_video_usp_example.py#L79
temb = self.time_text_embed(timestep, guidance, pooled_projections)
Previously, self.time_text_embed was an instance of the CombinedTimestepGuidanceTextProjEmbeddings class, but now it is an instance of HunyuanVideoConditionEmbedding, and its internals changed at the same time (call signature, return values, and the forward method itself). The example works again if the line above is changed to:
temb, token_replace_emb = self.time_text_embed(timestep, pooled_projections, guidance)
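For code that has to run against diffusers versions both before and after this change, a small wrapper along these lines might help. This is only a sketch: compute_temb is a hypothetical helper name, the two classes below are stand-ins that mimic just the call signatures described above (they are not the real diffusers modules), and dispatching on the class name is my assumption about a reasonable way to tell the two apart.

```python
# Stand-in classes: they only mimic the two call conventions described in
# this issue and are NOT the diffusers implementations.
class CombinedTimestepGuidanceTextProjEmbeddings:
    def __call__(self, timestep, guidance, pooled_projections):
        # Old convention: (timestep, guidance, pooled_projections) -> temb
        return ("temb", timestep, guidance, pooled_projections)


class HunyuanVideoConditionEmbedding:
    def __call__(self, timestep, pooled_projections, guidance):
        # New convention: (timestep, pooled_projections, guidance)
        # -> (temb, token_replace_emb)
        temb = ("temb", timestep, guidance, pooled_projections)
        token_replace_emb = None  # placeholder value for this sketch
        return temb, token_replace_emb


def compute_temb(time_text_embed, timestep, guidance, pooled_projections):
    """Hypothetical helper: return temb for either embedding class."""
    # Assumption: the class name is enough to detect the new convention.
    if type(time_text_embed).__name__ == "HunyuanVideoConditionEmbedding":
        temb, _token_replace_emb = time_text_embed(
            timestep, pooled_projections, guidance
        )
        return temb
    return time_text_embed(timestep, guidance, pooled_projections)
```

In the xDiT example, the failing line would then become something like temb = compute_temb(self.time_text_embed, timestep, guidance, pooled_projections), with token_replace_emb discarded since the T2V path does not appear to use it.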
I did some quick testing with two prompts, and the new class seems to produce slightly different outputs than the previous one, so it does not appear to be equivalent to the old behaviour. The class change seems to be related to adapting the existing diffusers code to the new I2V model, and one would assume that this shouldn't change the behaviour of T2V unless there was a clear bug before. Interestingly, the previous class still exists in diffusers (so I suppose one could change self.time_text_embed back to that if needed?).
@a-r-r-o-w @yiyixuxu what do you think about these observations? Could you shed some more light on this PR from the T2V point of view? Was the behaviour of self.time_text_embed for T2V changed intentionally?