Issue with RepeatAugSampler: Augmented Versions May End Up in the Same Process
Hello,
I've been working with the RepeatAugSampler, and I have some doubts regarding the following comment in its code:
It ensures that each augmented version of a sample will be visible to a different process (GPU)
From my understanding, this statement does not always hold true—particularly in configurations where num_replicas < num_repeats. For example, if num_replicas = 2 and num_repeats = 3, multiple augmented versions of the same image may end up assigned to the same process. In such cases, these versions can appear consecutively in a batch.
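To make the example concrete, here is a minimal pure-Python sketch of how I read the sampler's index logic: repeat every index num_repeats times, then give each rank every num_replicas-th entry. Shuffling, padding, and the final truncation are left out, and `per_rank_indices` is a hypothetical helper of mine, not part of the library.

```python
def per_rank_indices(num_samples, num_repeats, num_replicas, rank):
    """Simplified model (an assumption) of repeat-then-stride sampling."""
    # e.g. [0, 0, 0, 1, 1, 1, ...] for num_repeats = 3
    repeated = [i for i in range(num_samples) for _ in range(num_repeats)]
    # each rank takes every num_replicas-th entry, starting at its own rank
    return repeated[rank::num_replicas]


for rank in range(2):
    print(rank, per_rank_indices(num_samples=4, num_repeats=3,
                                 num_replicas=2, rank=rank))
# 0 [0, 0, 1, 2, 2, 3]
# 1 [0, 1, 1, 2, 3, 3]
```

Under this model, rank 0 sees samples 0 and 2 twice in a row, and rank 1 sees samples 1 and 3 twice in a row, so two augmented versions of the same image can sit next to each other in a batch.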
This can become problematic when using augmentation techniques like Mixup or CutMix, as they may end up mixing different augmentations of the same image. I assume this is not ideal, since the intent of these techniques is to mix semantically different samples to improve generalization.
Questions:
- Is my understanding correct that RepeatAugSampler does not prevent this behavior when num_replicas < num_repeats?
- If so, is there a recommended strategy to avoid this? One idea I had was to permute the batch after sampling but before applying Mixup/CutMix, in a way that minimizes the chance of mixing augmented versions of the same original sample.
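Here is a rough sketch of the permutation idea from the second question. It rests on two assumptions that I have not verified: that the dataset can also return each sample's source index, and that Mixup/CutMix pairs element i with element B-1-i (a batch flip). `flip_safe_order` and `flip_conflicts` are hypothetical helpers, not existing API.

```python
def flip_conflicts(source_ids):
    """Count flip pairs (i, n-1-i) that share the same source sample."""
    n = len(source_ids)
    return sum(source_ids[i] == source_ids[n - 1 - i] for i in range(n // 2))


def flip_safe_order(source_ids):
    """Return a batch permutation that keeps equal source ids out of flip pairs.

    Sorting by source id makes duplicates contiguous; pairing entry i with
    entry i + offset then separates them unless one source fills more than
    about half of the batch.
    """
    n = len(source_ids)
    half = n // 2
    offset = n - half  # equals half for even n, half + 1 for odd n
    by_source = sorted(range(n), key=lambda j: source_ids[j])
    order = [0] * n
    for i in range(half):
        order[i] = by_source[i]
        order[n - 1 - i] = by_source[i + offset]
    if n % 2 == 1:
        # the middle element is paired with itself under a flip
        order[half] = by_source[half]
    return order


ids = [5, 7, 9, 9, 7, 5]  # worst case: every flip pair is a duplicate
order = flip_safe_order(ids)
reordered = [ids[j] for j in order]
print(flip_conflicts(ids), flip_conflicts(reordered))  # 3 0
```

The same `order` would have to be applied to both the image tensor and the targets (for example `x[order]`, `y[order]`) before mixing. The result is deterministic, so some additional shuffling may be wanted to keep the pairings varied across steps.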
I would appreciate any insights or suggestions!
Thanks!