There are two arrays, A and B, that correspond to each other; their space is allocated while the kernel is running. A[i] holds a position and B[i] holds its value. Every thread does the following:
- If the current thread's data is already in the arrays, update B;
- Otherwise, expand A and B and insert the current thread's data into the arrays.
- The initial size of A and B is zero.
Is the above scheme supported by CUDA?
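For reference, the per-thread logic described above can be sketched as a CUDA kernel. All names (`A`, `B`, `size`, `positions`, `values`) are illustrative, and the growth step is deliberately left empty, since that is exactly the open question:

```cuda
// Hypothetical sketch of the per-thread update-or-insert logic.
// A, B, and size are assumed to live in global memory.
__global__ void updateOrInsert(int *A, float *B, int *size,
                               const int *positions, const float *values, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    // search the currently stored positions for this thread's position
    for (int i = 0; i < *size; ++i) {
        if (A[i] == positions[tid]) {
            B[i] = values[tid];   // position already present: update its value
            return;
        }
    }
    // position not found: A and B would have to be enlarged and the
    // element inserted here -- the step standard CUDA does not provide
}
```

Note that even the search/update step as sketched is racy when many threads read and write the arrays concurrently; a real implementation would need atomics or other synchronization.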
Could you please clarify point #1? – Vitality Commented Sep 13, 2013 at 8:15
Point #1 means that A[i] and B[i] store the position and value of the i-th element; the current thread may update B[i] if the position of the current thread's element is already in array A. – taoyuanjl Commented Sep 13, 2013 at 8:46
1 Answer
Concerning point #2, you would need something like C's realloc(), which, as far as I know, is not supported in CUDA device code. You can write your own realloc() following this post:

CUDA: Using realloc inside kernel

but I do not know how efficient that solution would be.
Alternatively, you could pre-allocate a "large" amount of global memory, sized for the worst-case memory occupation scenario.
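A minimal sketch of that pre-allocation approach, assuming a worst-case capacity bound `MAX_ELEMS` and an atomically incremented element counter (all names are illustrative, not a definitive implementation):

```cuda
#define MAX_ELEMS 4096   // assumed worst-case bound, chosen for illustration

__global__ void updateOrInsert(int *A, float *B, int *count,
                               const int *positions, const float *values, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    int pos   = positions[tid];
    float val = values[tid];

    // linear search over the elements inserted so far
    int m = *count;
    for (int i = 0; i < m; ++i) {
        if (A[i] == pos) { B[i] = val; return; }
    }

    // not found: claim the next free slot with an atomic counter
    int slot = atomicAdd(count, 1);
    if (slot < MAX_ELEMS) {
        A[slot] = pos;
        B[slot] = val;
    }
}
```

Here A and B (capacity `MAX_ELEMS`) and count (zero-initialized) would be allocated with cudaMalloc before the launch. This sketch can still insert duplicates when two threads holding the same new position race past the search at the same time; the point is only that a fixed worst-case allocation plus atomicAdd sidesteps in-kernel reallocation entirely.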