Distributed Data Parallel communication hook #667

davidtweedle started this conversation in General
Hello,
I am working on a simple idea for a submission to this contest. My idea requires registering a communication hook on the DistributedDataParallel model from PyTorch. Essentially, I want to compute the gradient, perform some calculation on it separately on each GPU, and then all_reduce the results. I do not think this violates the spirit of the rules, but please let me know if you agree. Thank you for your time.
-David Tweedle
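
(For concreteness, here is a minimal sketch of the kind of hook described above, assuming PyTorch's DistributedDataParallel `register_comm_hook` API; the clamp is only a placeholder for the real per-GPU calculation.)

```python
import torch
import torch.distributed as dist


def local_transform_then_allreduce(state, bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
    # bucket.buffer() holds this rank's flattened gradients for the bucket.
    grad = bucket.buffer()

    # Per-GPU calculation applied before any communication
    # (clamping is just a stand-in for the actual transform).
    grad.clamp_(min=-1.0, max=1.0)

    # Average the locally transformed gradients across all ranks.
    grad.div_(dist.get_world_size())
    fut = dist.all_reduce(grad, async_op=True).get_future()
    return fut.then(lambda f: f.value()[0])


# Usage (ddp_model is a torch.nn.parallel.DistributedDataParallel instance):
# ddp_model.register_comm_hook(state=None, hook=local_transform_then_allreduce)
```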

Replies: 1 comment 2 replies


Hi David,
It sounds like you're asking whether you can apply manipulations to the gradients per data shard. Is that correct?
If that is the question, I think that is within the spirit of the rules.

Also, have you checked out our submission API in https://github.com/mlcommons/algorithmic-efficiency/blob/main/submissions/template/submission.py and the example implementations (https://github.com/mlcommons/algorithmic-efficiency/blob/main/reference_algorithms/paper_baselines/adamw/pytorch/submission.py#L93)? The idea is that submitters are free to implement each of the submission APIs as they wish. Our workload loss functions return 'unreduced' loss values, so I believe you should be able to compute and perform calculations on the gradients per shard.

@runame can you confirm?
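
(Illustrative only: a rough sketch of the pattern suggested above, not the actual template code. It assumes gradient synchronization is handled manually, i.e. the model is not DDP-wrapped or its automatic all_reduce is disabled, and that `loss_fn` returns per-example losses as described; the clipping stands in for whatever per-shard calculation a submission might do.)

```python
import torch
import torch.distributed as dist


def update_step(model, loss_fn, batch, optimizer):
    """One training step with a per-shard gradient transform before syncing."""
    optimizer.zero_grad()

    # 'Unreduced' per-example losses, reduced locally on this shard only.
    per_example_losses = loss_fn(model(batch["inputs"]), batch["targets"])
    per_example_losses.mean().backward()

    for p in model.parameters():
        if p.grad is None:
            continue
        # Per-shard manipulation of the local gradient (placeholder transform).
        p.grad.clamp_(min=-1.0, max=1.0)
        # Manually average the transformed gradients across shards.
        dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
        p.grad.div_(dist.get_world_size())

    optimizer.step()
```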


runame Mar 2, 2024
Collaborator

Agreed, this should definitely be within the spirit of the rules.


davidtweedle Mar 5, 2024
Collaborator Author

Yes, I want to perform calculations on each shard. After looking more closely at the examples, I can see how to do so. Thank you for your help.
