The Example 1 docstring in torchrec/distributed/sharding/dynamic_sharding.py reads:

```
Example 1 NOTE: the ordering by rank:
Rank 0 sends table_0, shard_0 to Rank 1.
Rank 2 sends table_1, shard_0 to Rank 1.
Rank 2 sends table_1, shard_1 to Rank 0
Rank 3 sends table_0, shard_1 to Rank 0

NOTE: table_1 comes first due to its source rank being 'first'

On Rank 0: output_tensor = [
    <table_1, shard_0>, # from rank 2
    <table_0, shard_1>  # from rank 3
]

On Rank 1: output_tensor = [
    <table_0, shard_0>, # from rank 0
    <table_1, shard_0>  # from rank 2
]
```
Maybe the following is correct instead?

```
On Rank 0: output_tensor = [
    <table_1, shard_1>, # from rank 2
    <table_0, shard_1>  # from rank 3
]
```

Rank 2 sends table_1, shard_1 (not shard_0) to Rank 0, so Rank 0 should receive <table_1, shard_1>. And if this is right, there are more errors in that commit to torchrec/distributed/sharding/dynamic_sharding.py.
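To double-check, here is a minimal sketch in plain Python (not the torchrec API; the `sends` list and ordering logic are only illustrative assumptions) that replays the four sends from Example 1 and orders each destination rank's received shards by source rank, as the docstring's NOTE describes:

```python
from collections import defaultdict

# Replay the sends from Example 1 as (src_rank, table, shard, dst_rank).
# These tuples are a hypothetical reconstruction of the docstring example,
# not data structures used by dynamic_sharding.py itself.
sends = [
    (0, "table_0", "shard_0", 1),
    (2, "table_1", "shard_0", 1),
    (2, "table_1", "shard_1", 0),
    (3, "table_0", "shard_1", 0),
]

received = defaultdict(list)
for src, table, shard, dst in sends:
    received[dst].append((src, table, shard))

for dst in sorted(received):
    # Per the docstring's NOTE, the output tensor is ordered by source rank.
    ordered = sorted(received[dst], key=lambda item: item[0])
    parts = [f"<{table}, {shard}> from rank {src}" for src, table, shard in ordered]
    print(f"Rank {dst} output_tensor order: {parts}")

# Prints:
# Rank 0 output_tensor order: ['<table_1, shard_1> from rank 2', '<table_0, shard_1> from rank 3']
# Rank 1 output_tensor order: ['<table_0, shard_0> from rank 0', '<table_1, shard_0> from rank 2']
```

Under that ordering rule, Rank 0 ends up with shard_1 of table_1 from rank 2, which matches the proposed correction above rather than the current docstring.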