Open
@zaptrem
Description
I tried to implement this for flow models as described in the appendix, but the result is complete collapse (exploding images). Did I make a mistake, or is this technique fundamentally incompatible with flow models (which have no renoising step)? Also, the paper doesn't define v lambda.
```python
import torch


def euler_cfgpp_update(
    x_t: torch.Tensor,
    t: float,
    dt: float,
    v_u: torch.Tensor,
    v_c: torch.Tensor,
    lambda_val: float,
) -> torch.Tensor:
    # Unconditional velocity at (x_t, t)
    # v_u = model_uncond(x_t, t)
    # Conditional velocity at (x_t, t)
    # v_c = model_cond(x_t, t)

    # Unconditional "Tweedie" estimate: x̃a(∅) = xt - t * v_u
    x_null = x_t + (1 - t) * v_u

    # Conditional "Tweedie" estimate: x̃a(c) = xt - t * v_c
    x_cond = x_t + (1 - t) * v_c

    # normal cfg prediction
    # x_cfg = x_t + (1 - t) * (v_u + 2.3 * (v_c - v_u))

    # CFG++ "Tweedie" estimate (interpolation):
    # x̃a(λ) = (1-λ)* x̃a(∅) + λ * x̃a(c)
    x_cfgpp = x_null + lambda_val * (x_cond - x_null)

    # Next time = t + dt
    t_next = t + dt

    # Euler step for CFG++:
    # xt1 = x̃a(λ)(xt0) + ( xt - x̃a(∅)(xt0) ) / t0 * t1
    # (Make sure t != 1 to avoid divide-by-zero, since the code divides by 1 - t!)
    # eps = 1e-12
    x_next = x_cfgpp + (x_t - x_null) * ((1 - t_next) / (1 - t))

    # vanilla cfg
    # x_next = x_cfg + (x_t - x_cfg) * ((1 - t_next) / (1 - (t + eps)))
    return x_next
```
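For what it's worth, expanding the step above by hand seems to reduce it to `x_t + dt * v_u + (1 - t) * lambda_val * (v_c - v_u)`, while the commented-out vanilla CFG path reduces to `x_t + dt * (v_u + w * (v_c - v_u))`. The sketch below checks that algebra on plain floats. The helper names `cfgpp_step`, `cfgpp_step_expanded`, and `cfg_step` are made up for this check and are not from the paper. It is only algebra on the code as posted, not a claim about what the appendix intends.

```python
import math


def cfgpp_step(x_t, t, dt, v_u, v_c, lam):
    """Same arithmetic as euler_cfgpp_update, on plain floats (hypothetical helper)."""
    x_null = x_t + (1 - t) * v_u
    x_cond = x_t + (1 - t) * v_c
    x_cfgpp = x_null + lam * (x_cond - x_null)
    t_next = t + dt
    return x_cfgpp + (x_t - x_null) * ((1 - t_next) / (1 - t))


def cfgpp_step_expanded(x_t, t, dt, v_u, v_c, lam):
    """Hand-expanded form of cfgpp_step: guidance term is weighted by (1 - t) * lam."""
    return x_t + dt * v_u + (1 - t) * lam * (v_c - v_u)


def cfg_step(x_t, t, dt, v_u, v_c, w):
    """Expanded form of the commented-out vanilla CFG step: guidance weighted by dt * w."""
    return x_t + dt * (v_u + w * (v_c - v_u))


# Arbitrary illustrative values, not taken from any experiment.
x_t, t, dt, v_u, v_c, lam = 0.3, 0.2, 0.05, 1.0, 1.5, 0.6

# The posted step and its expansion agree.
assert math.isclose(cfgpp_step(x_t, t, dt, v_u, v_c, lam),
                    cfgpp_step_expanded(x_t, t, dt, v_u, v_c, lam))

# The posted step matches a vanilla CFG step whose scale is lam * (1 - t) / dt.
w_equiv = lam * (1 - t) / dt
assert math.isclose(cfgpp_step(x_t, t, dt, v_u, v_c, lam),
                    cfg_step(x_t, t, dt, v_u, v_c, w_equiv))
```

If that expansion is right, the guidance term in this step is scaled by `(1 - t) * lambda_val` rather than by `dt` times a guidance scale, so for small `dt` the same numeric value plays a very different role in the two updates. I'm not sure whether that accounts for the collapse.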