
There are two arrays, A and B, that correspond to each other, and their space is allocated while the kernel is running. A[i] holds a position and B[i] holds the corresponding value. All the threads do the following:

  1. If the current thread's data is already in the arrays, update B.
  2. Otherwise, expand A and B and insert the current thread's data into them.
  3. The initial size of A and B is zero.

Is such an implementation supported by CUDA?

talonmies
asked Sep 13, 2013 at 7:31
  • Could you please clarify point #1? Commented Sep 13, 2013 at 8:15
  • Point #1 means that A[i] and B[i] store the position and value of the i-th element; the current thread may update B[i] if the position of the current thread's element is already in array A. Commented Sep 13, 2013 at 8:46

1 Answer 1


Concerning point #2, you would need something like C's realloc(), which, as far as I know, is not supported in CUDA device code. You can write your own realloc() following this post:

CUDA: Using realloc inside kernel

but I do not know how efficient this solution would be.
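A hand-rolled device-side realloc() might look like the sketch below. It assumes a device of compute capability 2.0 or later, where in-kernel malloc() and free() are available. The name myRealloc and the explicit oldSize parameter are my own additions, since the device heap does not track allocation sizes for you:

```cuda
// Sketch of a device-side realloc(); assumes compute capability >= 2.0
// (in-kernel malloc/free). The oldSize parameter is illustrative:
// the device heap does not expose the size of an allocation.
__device__ void* myRealloc(void* oldPtr, size_t oldSize, size_t newSize)
{
    void* newPtr = malloc(newSize);          // allocate the new block
    if (newPtr != NULL && oldPtr != NULL) {
        // copy the smaller of the two sizes, then free the old block
        memcpy(newPtr, oldPtr, oldSize < newSize ? oldSize : newSize);
        free(oldPtr);
    }
    return newPtr;                           // NULL if allocation failed
}
```

Note that the device heap is small by default (8 MB); it can be enlarged from the host with cudaDeviceSetLimit(cudaLimitMallocHeapSize, ...).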

Alternatively, you could pre-allocate a "large" amount of global memory, sized to accommodate the worst-case memory occupancy scenario.
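For the pre-allocation approach, a common pattern is to reserve a worst-case buffer once on the host and let threads claim slots with atomicAdd on a global size counter. This is a sketch only; CAPACITY and the kernel signature are illustrative assumptions, not taken from the question:

```cuda
#define CAPACITY (1 << 20)           // worst-case number of entries (assumption)

__device__ int d_size = 0;           // current number of used slots

// Each thread appends one (position, value) pair. atomicAdd hands out
// a unique slot index, so no two threads write to the same element.
__global__ void append(int* A, float* B, int pos, float val)
{
    int i = atomicAdd(&d_size, 1);   // claim the next free slot
    if (i < CAPACITY) {
        A[i] = pos;
        B[i] = val;
    }
}
```

A and B here would be allocated once with cudaMalloc(CAPACITY * sizeof(...)) before launching any kernels, so no in-kernel reallocation is needed.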

answered Sep 13, 2013 at 8:09

6 Comments

Thanks a lot! Another question: how can atomicity be guaranteed if more than one thread updates A and B at the same time?
Have a look at the CUDA C Programming Guide, Section B.11. There you will find information on how to use atomic operations in CUDA.
The atomic operations in Section B.11 operate on a single element of global or shared memory, such as B[i]; I want to guarantee atomicity for the whole array, so that other threads are refused access while one thread is accessing it.
You might consider using a critical section to control access to the array, but there are challenges and difficulties. Search on "cuda critical section".
Yes, as I mentioned, there would be challenges and difficulties. You could consider using a critical section to manage inter-block access, while using the ordinary thread-block communication methods (shared memory, __syncthreads(), etc.) to handle arbitration within a thread block.
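A minimal sketch of the critical-section idea discussed above, using atomicCAS/atomicExch on a global lock word. This is the commonly cited pattern, with the caveat that only one thread per block (e.g. thread 0) should contend for the lock, since multiple threads of the same warp spinning on it can deadlock:

```cuda
__device__ int lock = 0;             // 0 = free, 1 = taken

__global__ void updateArrays(int* A, float* B, int* size)
{
    if (threadIdx.x == 0) {          // one thread per block takes the lock
        while (atomicCAS(&lock, 0, 1) != 0)
            ;                        // spin until the lock is acquired
        // --- critical section: the arrays are owned by this block ---
        // ... search A, then update B or append to both (placeholder) ...
        // ------------------------------------------------------------
        __threadfence();             // make writes visible before release
        atomicExch(&lock, 0);        // release the lock
    }
    __syncthreads();                 // the rest of the block waits here
}
```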
