add fast inference tutorial #1948
Conversation
ericspod commented on Mar 2, 2025:
This addresses #1865 I assume.
KumoLiu commented on acceleration/fast_inference_tutorial/fast_inference_tutorial.ipynb:
Do you think we can also include the .nii.gz benchmark result in the notebook, since the original data is in .nii.gz format?
yiheng-wang-nv replied:
Hi @KumoLiu, thanks for the suggestion. .nii.gz files have to be decompressed on the CPU, so using GDS may not provide any acceleration. I added a section that introduces the limitations of each feature; could you help review the updates? Thanks!
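For readers who want to try GDS on such data anyway, a minimal sketch of pre-decompressing `.nii.gz` files to raw `.nii` might look like the following (the `data` folder and file names are hypothetical, not from the PR):

```python
import gzip
import shutil
from pathlib import Path

# GDS streams raw bytes from disk to GPU memory, so gzip-compressed
# .nii.gz files must be decompressed on the CPU first. One option is a
# one-off preprocessing pass that writes uncompressed .nii copies.
for src in Path("data").glob("*.nii.gz"):  # hypothetical data folder
    dst = src.with_suffix("")  # case_001.nii.gz -> case_001.nii
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
```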
Nic-Ma left a comment:
Thanks for adding the detailed tutorial; overall it looks good to me.
Do you plan to add INT8/INT4 quantization in this PR, or in a separate PR later?
Thanks.
yiheng-wang-nv commented on Mar 26, 2025 (on acceleration/fast_inference_tutorial/fast_inference_tutorial.ipynb):
Hi @Nic-Ma, thanks for the suggestion. I think we can consider adding quantization in a separate PR. Before adding it, we may need some time to (see the sketch after this list):
- prove it's faster
- prove there will not be too much accuracy loss
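As a rough, generic sketch of how both checks could be scripted (dynamic INT8 quantization of a toy CPU model, not the tutorial's actual TensorRT path or model):

```python
import copy
import time

import torch
import torch.nn as nn

# Toy stand-in model for illustration only; the tutorial's real model differs.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)).eval()

# Dynamic INT8 quantization of Linear layers (CPU-only in stock PyTorch).
quantized = torch.ao.quantization.quantize_dynamic(
    copy.deepcopy(model), {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(64, 1024)
with torch.inference_mode():
    # 1) prove it's faster
    t0 = time.perf_counter(); y_fp32 = model(x); t1 = time.perf_counter()
    y_int8 = quantized(x); t2 = time.perf_counter()
    print(f"fp32 {t1 - t0:.4f}s, int8 {t2 - t1:.4f}s")
    # 2) prove there is not too much accuracy loss
    print("max abs diff:", (y_fp32 - y_int8).abs().max().item())
```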
Nic-Ma commented on Mar 26, 2025:
Plan sounds good to me.
Thanks.
Review comment:
You could instead put this into a cell with %%bash at the top to allow users to run the command, or you could do it with Python more directly for those who don't have bash:

```python
for benchmark_type in ("original", "trt", "trt_gpu_transforms", "trt_gds_gpu_transforms"):
    !python run_benchmark.py --benchmark_type {benchmark_type}
```
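For reference, a sketch of the %%bash variant mentioned above (same `run_benchmark.py` invocation, wrapped in a shell loop):

```bash
%%bash
# Run each benchmark variant in turn via the cell's bash subshell.
for benchmark_type in original trt trt_gpu_transforms trt_gds_gpu_transforms; do
    python run_benchmark.py --benchmark_type "$benchmark_type"
done
```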
Review comment:
You should also state here that the script contains the same code as what's in this notebook, and that running it will generate a CSV with the results for each type; if the user wants to run the benchmark here in the notebook, they can run the following cell with the commented lines uncommented.
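As an illustration only (the CSV file name and columns below are assumptions, not from the PR), loading the generated results could be as simple as:

```python
import pandas as pd

# Hypothetical file name; run_benchmark.py writes a CSV of per-benchmark-type
# timings, whose actual name and schema may differ.
results = pd.read_csv("benchmark_results.csv")
print(results)
```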
ericspod commented on Mar 28, 2025:
I've looked at the tutorial and it all looks good to me; however, I am wondering about what the results show. It seems to me that GDS has the most impact, so the example is just IO bound, and using TRT or not has little impact. This is good to demonstrate how to overcome such issues, but it seems to me that the model is so small that it isn't relevant to the benchmarks you're showing. If you used a much larger model with many more parameters, the actual inference time itself would be significant. Since the inference results aren't considered, you could just use a randomly initialised model so you don't need to load pre-trained weights. Thoughts?
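A minimal sketch of the random-initialisation idea (the model choice and sizes here are illustrative, not what the PR uses):

```python
import torch
from monai.networks.nets import SegResNet

# A larger, randomly initialised network makes inference time significant
# relative to IO; no pre-trained weights are needed because the benchmark
# does not evaluate the predictions themselves.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SegResNet(
    spatial_dims=3,
    in_channels=1,
    out_channels=2,
    init_filters=32,  # widen the network to inflate compute (illustrative)
).eval().to(device)

with torch.inference_mode():
    x = torch.randn(1, 1, 96, 96, 96, device=device)
    print(model(x).shape)  # torch.Size([1, 2, 96, 96, 96])
```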
yiheng-wang-nv commented on Apr 11, 2025:
> I've looked at the tutorial and it all looks good to me; however, I am wondering about what the results show. It seems to me that GDS has the most impact, so the example is just IO bound, and using TRT or not has little impact. This is good to demonstrate how to overcome such issues, but it seems to me that the model is so small that it isn't relevant to the benchmarks you're showing. If you used a much larger model with many more parameters, the actual inference time itself would be significant. Since the inference results aren't considered, you could just use a randomly initialised model so you don't need to load pre-trained weights. Thoughts?
Thanks @ericspod for the suggestions. I will use a more suitable model to show these features and then update the PR.
ericspod commented on Jun 27, 2025:
Hi @yiheng-wang-nv, we'd like to get this tutorial through; do we have any progress on using a different model to better demonstrate the speedup? Thanks!
yiheng-wang-nv commented on Aug 25, 2025:
> Hi @yiheng-wang-nv, we'd like to get this tutorial through; do we have any progress on using a different model to better demonstrate the speedup? Thanks!
Hi @ericspod, thanks for the notice. Sorry for the late reply; I will make some updates later.
Part of #1865 .
Description
A few sentences describing the changes proposed in this pull request.
Checks
- Put figures and graphs in the `./figure` folder
- Notebook runs automatically: `./runner.sh -t <path to .ipynb file>`