
[WIP] Add Deepcache #705


Draft · rmatif wants to merge 5 commits into leejet:master from rmatif:deepcache

Conversation

rmatif (Contributor) commented Jun 18, 2025 (edited)

This PR is still a work in progress and far from complete. It adds DeepCache, a method for U-Net architectures that skips certain blocks and reuses their cached outputs in later denoising steps to save compute time.

I have been inspired by this ComfyUI implementation.

It adds a `--deepcache interval,depth,start,stop` argument.

Currently, it's not working well and I can't figure out why or how to achieve better results. I have been debugging the cache step and counter logic for a week, but the issue seems to be more subtle than that.
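For readers following along, the step/counter gating the arguments describe can be sketched like this. This is a minimal sketch of the usual uniform-interval DeepCache schedule, not this PR's actual code; `use_cached_features` is a hypothetical name, and DeepCache also supports non-uniform refresh schedules that this sketch ignores:

```python
def use_cached_features(step: int, interval: int, start: int, stop: int) -> bool:
    """DeepCache gating: within [start, stop), recompute the deep U-Net
    blocks only every `interval` steps and reuse the cached features on
    all other steps. Outside the window, the full U-Net always runs."""
    if step < start or step >= stop:
        return False  # full forward pass, no caching
    return step % interval != 0  # cache hit except on refresh steps

# With --deepcache 2,3,0,8 over 8 steps: steps 0, 2, 4, 6 refresh the
# cache (full pass down to depth 3); steps 1, 3, 5, 7 reuse it.
refresh_steps = [s for s in range(8) if not use_cached_features(s, 2, 0, 8)]
```

`depth` controls how many U-Net levels are recomputed on cache-hit steps (only the shallow blocks above the cached features run), which is why deeper values save more time but lose more detail.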

Command example:

./build/bin/sd -m ../models/realisticVisionV60B1_v51HyperVAE.safetensors -v -p "cute cat" --cfg-scale 2.5 --steps 8 --deepcache 2,3,0,8
[Comparison images: without DeepCache · `--deepcache 2,3,0,8` · `--deepcache 3,3,0,8`]

If someone could help by taking a look or continuing the work, I would be grateful. Otherwise, I don't think I'll spend more time on it.

FSSRepo (Contributor) commented Jul 4, 2025

I'm very interested in this PR; I wish I had time to test DeepCache in ComfyUI and compare the results with your PR.

rmatif (Contributor, author) commented Jul 5, 2025

@FSSRepo Thanks for your interest! Here's a comparison with ComfyUI, using the same model and parameters as above.

[Comparison images: without DeepCache · interval = 2, depth = 3, start = 0, stop = 8 · interval = 3, depth = 3, start = 0, stop = 8]

The results are so much better in ComfyUI compared to what I’m getting. My implementation doesn’t seem to work without CFG, which is really odd since DeepCache is supposed to be CFG-agnostic. I’m definitely doing something wrong. I'd love to continue working on this, but I’ve run out of ideas. It would be great if you could take a look and share any feedback!

stduhpf (Contributor) commented Jul 5, 2025

> My implementation doesn’t seem to work without CFG, which is really odd since DeepCache is supposed to be CFG-agnostic.

My guess is that your implementation is sharing the same cache between the unconditional and conditional passes, and it probably shouldn't.
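The separation being suggested can be sketched minimally like this, with a plain dict standing in for whatever tensor storage the real implementation uses (hypothetical names, not tied to this PR's code):

```python
# Hypothetical sketch: one DeepCache feature cache per guidance branch,
# so the unconditional and conditional passes never overwrite each other.
caches = {"cond": {}, "uncond": {}}

def cache_for(is_cond: bool) -> dict:
    """Select the cache belonging to the current guidance branch."""
    return caches["cond" if is_cond else "uncond"]

# Each CFG step runs two forward passes, each against its own cache:
cache_for(True)["depth3"] = "features from conditional pass"
cache_for(False)["depth3"] = "features from unconditional pass"
```

Without this separation, the second pass of a step would read features cached by the first pass with different conditioning, which could plausibly explain accumulating error.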

rmatif (Contributor, author) commented Jul 6, 2025

> > My implementation doesn’t seem to work without CFG, which is really odd since DeepCache is supposed to be CFG-agnostic.
>
> My guess is that it's sharing the same cache between uncond and conditioned pass and it's probably not supposed to.

I tried creating separate caches for the conditional and unconditional passes, but that broke things even more. In any case, I think we should fix the CFG case first before addressing the CFG-free issue; I don't think the two are related.

FSSRepo (Contributor) commented Jul 12, 2025

What is CFG?

From what I understand, it's what we use when we pass the `--cfg-scale` parameter. Why is it being referred to as something that's missing in this project?

Or is it a DeepCache configuration?

stduhpf (Contributor) commented Jul 12, 2025 (edited)

> What is CFG?

CFG means classifier-free guidance. It's a way to control how strongly the prompt affects conditional generation, by linearly extrapolating from the prediction without text conditioning (or with a negative prompt) toward the conditioned prediction. It therefore needs two forward passes per step: one with the positive prompt, and one with the empty/negative prompt.

rmatif (Contributor, author) commented Jul 12, 2025 (edited)

> What is CFG?
>
> From what I understand, it's what we use when we pass the `--cfg-scale` parameter. Why is it being referred to as something that's missing in this project?
>
> Or is it a DeepCache configuration?

It's just `--cfg-scale`. When you run inference with a CFG scale of 1, the results are significantly worse, almost garbage. The more steps you take, the further off the output gets, as if some error accumulates at each step.

I did try separating the cache between the conditional and unconditional passes, but that didn't help; in fact, it broke the case where we run with CFG > 1. From my understanding, DeepCache operates at a higher level and shouldn't be affected by the conditional/unconditional distinction. Something is seriously wrong here, but I can't quite put my finger on it.

EDIT: I may have wrongly assumed that you're familiar with the concept of CFG, but @stduhpf already explained it well. Basically, during inference, you're doing:

`final_prediction = prediction_unconditional + w * (prediction_conditional - prediction_unconditional)`

When w = 1, the formula collapses to the conditional prediction alone, so you're effectively running only the conditional pass. That's useful because it can double your inference speed, and distilled models support this approach. However, you do trade off some prompt fidelity when doing so.
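The formula above as runnable code, with scalars standing in for the prediction tensors (`cfg_combine` is a hypothetical name for illustration):

```python
def cfg_combine(uncond: float, cond: float, w: float) -> float:
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the conditional one with guidance weight w."""
    return uncond + w * (cond - uncond)

# w = 1 collapses to the conditional prediction alone, so the
# unconditional forward pass can be skipped entirely:
assert cfg_combine(1.0, 3.0, 1.0) == 3.0

# w > 1 pushes the result past the conditional prediction,
# strengthening the prompt's influence:
assert cfg_combine(1.0, 3.0, 2.0) == 5.0
```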

I recently read a paper concluding that CFG might actually be useless: it only appears to work because we end up using twice the compute.
