
[WIP] Add Deepcache #705


Draft · rmatif wants to merge 5 commits into leejet:master from rmatif:deepcache

Conversation

rmatif (Contributor) commented Jun 18, 2025 (edited)

This PR is still a work in progress and far from complete. It adds DeepCache, a method for U-Net architectures that skips certain blocks and reuses their cached outputs in later denoising steps to save compute time.

I have been inspired by this ComfyUI implementation.

It adds a `--deepcache interval,depth,start,stop` argument.

Currently, it's not working well and I can't figure out why or how to achieve better results. I have been debugging the cache step and counter logic for a week, but the issue seems to be more subtle than that.
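For readers following along, the step/counter gating the arguments describe can be sketched like this. This is a minimal sketch of the usual uniform-interval DeepCache schedule, not this PR's actual code; `use_cached_features` is a hypothetical name, and DeepCache also supports non-uniform refresh schedules that this sketch ignores:

```python
def use_cached_features(step: int, interval: int, start: int, stop: int) -> bool:
    """DeepCache gating: within [start, stop), recompute the deep U-Net
    blocks only every `interval` steps and reuse the cached features on
    all other steps. Outside the window, the full U-Net always runs."""
    if step < start or step >= stop:
        return False  # full forward pass, no caching
    return step % interval != 0  # cache hit except on refresh steps

# With --deepcache 2,3,0,8 over 8 steps: steps 0, 2, 4, 6 refresh the
# cache (full pass down to depth 3); steps 1, 3, 5, 7 reuse it.
refresh_steps = [s for s in range(8) if not use_cached_features(s, 2, 0, 8)]
```

`depth` controls how many U-Net levels are recomputed on cache-hit steps (only the shallow blocks above the cached features run), which is why deeper values save more time but lose more detail.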

Command example:

./build/bin/sd -m ../models/realisticVisionV60B1_v51HyperVAE.safetensors -v -p "cute cat" --cfg-scale 2.5 --steps 8 --deepcache 2,3,0,8
[Comparison images: without DeepCache · `--deepcache 2,3,0,8` · `--deepcache 3,3,0,8`]

If someone could help by taking a look or continuing the work, I would be grateful. Otherwise, I don't think I'll spend more time on it.

FSSRepo (Contributor) commented Jul 4, 2025

I'm very interested in this PR; I wish I had time to test DeepCache in ComfyUI and compare the results with your PR.

rmatif (Contributor, author) commented Jul 5, 2025

@FSSRepo Thanks for your interest! Here's a comparison with ComfyUI, using the same model and parameters as above.

[Comparison images: without DeepCache · interval = 2, depth = 3, start = 0, stop = 8 · interval = 3, depth = 3, start = 0, stop = 8]

The results are so much better in ComfyUI compared to what I’m getting. My implementation doesn’t seem to work without CFG, which is really odd since DeepCache is supposed to be CFG-agnostic. I’m definitely doing something wrong. I'd love to continue working on this, but I’ve run out of ideas. It would be great if you could take a look and share any feedback!

stduhpf (Contributor) commented Jul 5, 2025

> My implementation doesn’t seem to work without CFG, which is really odd since DeepCache is supposed to be CFG-agnostic.

My guess is that your implementation is sharing the same cache between the unconditional and conditional passes, and it probably shouldn't.
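The separation being suggested can be sketched minimally like this, with a plain dict standing in for whatever tensor storage the real implementation uses (hypothetical names, not tied to this PR's code):

```python
# Hypothetical sketch: one DeepCache feature cache per guidance branch,
# so the unconditional and conditional passes never overwrite each other.
caches = {"cond": {}, "uncond": {}}

def cache_for(is_cond: bool) -> dict:
    """Select the cache belonging to the current guidance branch."""
    return caches["cond" if is_cond else "uncond"]

# Each CFG step runs two forward passes, each against its own cache:
cache_for(True)["depth3"] = "features from conditional pass"
cache_for(False)["depth3"] = "features from unconditional pass"
```

Without this separation, the second pass of a step would read features cached by the first pass with different conditioning, which could plausibly explain accumulating error.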

rmatif (Contributor, author) commented Jul 6, 2025

> > My implementation doesn’t seem to work without CFG, which is really odd since DeepCache is supposed to be CFG-agnostic.
>
> My guess is that it's sharing the same cache between uncond and conditioned pass and it's probably not supposed to.

I tried creating separate caches for the conditional and unconditional passes, but that broke things even more. In any case, I think we should fix the CFG case first before addressing the CFG-free issue; I don't think the two are related.

FSSRepo (Contributor) commented Jul 12, 2025

What is CFG?

From what I understand, it's what we use when we pass the `--cfg-scale` parameter. Why is it being referred to as something that's missing in this project?

Or is it a DeepCache configuration?

stduhpf (Contributor) commented Jul 12, 2025 (edited)

> What is CFG?

CFG means classifier-free guidance. It's a way to control how strongly the prompt affects conditional generation, by linearly extrapolating from the prediction without text conditioning (or with a negative prompt) toward the conditioned prediction. It therefore needs two forward passes per step: one with the positive prompt, and one with the empty/negative prompt.

rmatif (Contributor, author) commented Jul 12, 2025 (edited)

> What is CFG?
>
> From what I understand, it's what we use when we pass the `--cfg-scale` parameter. Why is it being referred to as something that's missing in this project?
>
> Or is it a DeepCache configuration?

It's just `--cfg-scale`. When you run inference with a CFG scale of 1, the results are significantly worse, almost garbage. The more steps you take, the further off the output gets, as if some error accumulates at each step.

I did try separating the cache between the conditional and unconditional passes, but that didn't help; in fact, it broke the case where we run with CFG > 1. From my understanding, DeepCache operates at a higher level and shouldn't be affected by the conditional/unconditional distinction. Something is seriously wrong here, but I can't quite put my finger on it.

EDIT: I may have wrongly assumed that you're familiar with the concept of CFG, but @stduhpf already explained it well. Basically, during inference, you're doing:

`final_prediction = prediction_unconditional + w * (prediction_conditional - prediction_unconditional)`

When w = 1, the formula collapses to the conditional prediction alone, so you're effectively running only the conditional pass. That's useful because it can double your inference speed, and distilled models support this approach. However, you do trade off some prompt fidelity when doing so.
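The formula above as runnable code, with scalars standing in for the prediction tensors (`cfg_combine` is a hypothetical name for illustration):

```python
def cfg_combine(uncond: float, cond: float, w: float) -> float:
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the conditional one with guidance weight w."""
    return uncond + w * (cond - uncond)

# w = 1 collapses to the conditional prediction alone, so the
# unconditional forward pass can be skipped entirely:
assert cfg_combine(1.0, 3.0, 1.0) == 3.0

# w > 1 pushes the result past the conditional prediction,
# strengthening the prompt's influence:
assert cfg_combine(1.0, 3.0, 2.0) == 5.0
```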

I recently read a paper concluding that CFG might actually be useless: it only appears to work because we end up using twice the compute.
