[feature] Support Flux TensorRT Pipeline #12218


Open · vuongminh1907 wants to merge 17 commits into huggingface:main from vuongminh1907:main

Conversation

@vuongminh1907 (Contributor) commented Aug 22, 2025 (edited)

What does this PR do?

This PR addresses issue #12202.
It introduces initial support for the FluxPipeline with TensorRT acceleration.

For installation steps and usage examples, please check the updated README.md.

Below are sample results comparing PyTorch (before TRT) vs TensorRT (after TRT):

🐱 Example 1: Diffusers (before TRT) vs. TensorRT (after TRT), side-by-side images.

👧 Example 2: Diffusers (before TRT) vs. TensorRT (after TRT), side-by-side images.

vuongminh1907 and others added 9 commits August 20, 2025 16:52
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

@sayakpaul (Member) left a comment:

Left some questions. Thanks for starting this!

Comment on lines 71 to 74:

```python
engine_transformer_path = "path/to/transformer/engine_trt10.13.2.6.plan"
engine_vae_path = "path/to/vae/engine_trt10.13.2.6.plan"
engine_t5xxl_path = "path/to/t5/engine_trt10.13.2.6.plan"
engine_clip_path = "path/to/clip/engine_trt10.13.2.6.plan"
```
@sayakpaul (Member) · Aug 22, 2025
How to derive these? Can we host some of these files on the Hub and supplement in this project?

@vuongminh1907 (Contributor, Author) · Aug 22, 2025

These .plan files depend on the GPU they were built on. I only have them for the H100 SXM; do you need these files to run?

We follow the official [NVIDIA/TensorRT](https://github.com/NVIDIA/TensorRT) repository to build TensorRT.

> **Note:**
> TensorRT was originally built with `diffusers==0.31.1`.
@sayakpaul (Member) · Aug 22, 2025
TensorRT's build shouldn't depend on the diffusers version, no?

@vuongminh1907 (Contributor, Author) · Aug 22, 2025
Yeah, but in the TensorRT repo it's used with this version. After building the TRT files, though, we can use my script for inference with the current diffusers version.

@vuongminh1907 (Contributor, Author) · Aug 22, 2025
There may be other ways to build it, or I will write a new one later; this is just a quick way to build. For now it needs:

> - one **venv** for building, and
> - another **venv** for inference.

(🔜 TODO: Build scripts for the latest `diffusers` will be added later.)
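The two-environment note above can be sketched as a shell workflow. The env names are illustrative, and the `diffusers==0.31.1` pin is the version mentioned earlier in this thread as what the TensorRT build scripts were written against:

```bash
# Two separate virtual environments, per the note above (names illustrative).
python3 -m venv trt-build-env   # venv for building the TRT engines
python3 -m venv trt-infer-env   # venv for running inference

# Build env: pin the older diffusers the TensorRT build scripts expect, e.g.
#   trt-build-env/bin/pip install "diffusers==0.31.1" tensorrt
# Inference env: current diffusers plus the TensorRT runtime, e.g.
#   trt-infer-env/bin/pip install diffusers tensorrt
```

Keeping the installs separate avoids the version conflict between the build-time and inference-time diffusers.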
@sayakpaul (Member) · Aug 22, 2025
What is needed for this? Can we maybe comment on that so that other contributors could pick it up if interested?

```bash
pip install -r requirements.txt
```

### ⚡ Fast Building with Static Shapes
@sayakpaul (Member) · Aug 22, 2025
If this can be run from the repository NVIDIA/TensorRT itself, then what is the purpose of this example?

@vuongminh1907 (Contributor, Author) · Aug 22, 2025
Just a guide for building .plan files, in case someone has no idea how to build them.
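For reference, one common way to build a `.plan` file from a static-shape ONNX export is the `trtexec` CLI that ships with TensorRT. A sketch, with illustrative paths (the engine file name just mirrors the `engine_trt10.13.2.6.plan` convention used above):

```bash
# Build a static-shape FP16 engine from an ONNX export (paths illustrative).
# Because the ONNX graph is exported with static shapes, no
# --minShapes/--optShapes/--maxShapes profile flags are needed.
trtexec \
  --onnx=onnx/transformer/model.onnx \
  --saveEngine=engines/transformer/engine_trt10.13.2.6.plan \
  --fp16
```

The same invocation would be repeated for the VAE, T5-XXL, and CLIP exports.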

@sayakpaul (Member) · Aug 23, 2025
Oh, then I think it should live in the NVIDIA TensorRT repository, as most of the examples seem to be taken from there.

@vuongminh1907 (Contributor, Author) · Aug 23, 2025
Okay @sayakpaul, I will remove it and maybe write code to build it in Diffusers in the near future.

@vuongminh1907 (Contributor, Author)

Hey @sayakpaul, I added code to build TRT. You can take a look; I think everything is working well now.

Comment on lines +52 to +55
You can convert all ONNX checkpoints to TensorRT engines with a single command:
```bash
python convert_trt.py
```
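For context, an ONNX-to-engine conversion step like the one `convert_trt.py` performs typically looks something like the following with TensorRT's Python API. This is a hedged sketch, not this PR's actual script: the function name and paths are illustrative, and actually running `build_engine` requires a machine with TensorRT installed and a supported GPU.

```python
def build_engine(onnx_path: str, plan_path: str, fp16: bool = True) -> None:
    """Compile one ONNX file into a serialized TensorRT engine (.plan).

    Sketch of a typical TRT 10.x conversion; requires TensorRT on a GPU box.
    """
    import tensorrt as trt  # deferred: only needed when actually building

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)  # TRT 10: networks are explicit-batch
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    config = builder.create_builder_config()
    if fp16:
        config.set_flag(trt.BuilderFlag.FP16)

    serialized = builder.build_serialized_network(network, config)
    with open(plan_path, "wb") as f:
        f.write(serialized)

# e.g. build_engine("onnx/vae/model.onnx", "engines/vae/engine.plan")
```

A real conversion script would loop this over the transformer, VAE, T5, and CLIP ONNX files.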
@sayakpaul (Member) · Aug 26, 2025

Do we need ONNX checkpoints for TensorRT?

@vuongminh1907 (Contributor, Author) · Aug 27, 2025

Yeah, I mentioned it in the README on this line.

@sayakpaul (Member) · Aug 27, 2025

It would be nice to have the option of taking a diffusers checkpoint and converting it to TensorRT.

Do you think that is possible?

@vuongminh1907 (Contributor, Author) · Aug 27, 2025

Yeah, I did it, but in my own way. I think I’ll code this for Diffusers and create a new PR.

@sayakpaul (Member) · Aug 27, 2025

That would be actually amazing!

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@vuongminh1907 (Contributor, Author)

Hey @sayakpaul, is something wrong with this check?

@sayakpaul (Member)

Not sure why you closed the PR.

@vuongminh1907 (Contributor, Author)

Sorry, I'm not sure why this check failed. I asked you about it earlier but didn't get a reply. I thought something was wrong with my PR, so I closed it.

@sayakpaul (Member)

@bot /style

github-actions bot (Contributor) commented Sep 11, 2025 (edited)

Style bot fixed some files and pushed the changes.

@vuongminh1907 (Contributor, Author)

Thanks @sayakpaul for reopening! I'll create a new PR for ONNX export once this one is merged into main.

Reviewers: @sayakpaul left review comments. At least 1 approving review is required to merge this pull request.