I have started developing for the GPU using OpenCL, and I have been playing around with code that pushes its limits.
In doing so I keep running into situations where the computation time on the GPU is long enough that the GUI becomes unresponsive, and/or the GPU task takes so long that the device driver is reset.
I understand why this happens and am not looking for an explanation of why; what I am hoping to understand is how far I can push computation on a GPU that is also being used by the system for GUI operations.
Are there any guidelines/best practices for this type of interaction?
Are there any programming methods that would allow for long-running GPU computation while still keeping the GUI responsive?
I know the basic recommendation would be to split the GPU task into relatively small pieces; assume that this is not possible, since I am exploring the limits of GPU programming.
Any online discussions would be very useful.
Jim K
-
You are probably interested in asynchronous programming; that's the usual "solution" for keeping your application responsive while performing some tasks in the background. It depends on what languages you are using; in C++ there is Boost or the latest C++11 standard, and both offer support for async tasks/methods. – user2485710, Aug 19, 2013 at 0:16
-
Sorry if I did not make it clear. It is the display itself that is becoming unresponsive, not my GUI; in fact I am using the command line. – Jim Kramer, Aug 19, 2013 at 1:03
2 Answers
To answer your question: no, there is nothing you can do to achieve your goal of having a long-running kernel and a functioning GUI all on one GPU. If you want long-running kernels and a functioning GUI, you must use a dedicated GPU for computing. If you want a responsive GUI while doing computations on the same GPU, you must have short-running kernels. You could complain every week on the AMD or Nvidia forums begging for this feature.
The only platform-independent way to divide your work that comes to mind is to limit the amount of work sent to the GPU so that each batch finishes in something like 1/60th of a second (for 60 Hz screens), and to include a sleep call that puts the CPU thread to sleep for a short while so other applications can send tasks to the GPU. You may have to adjust that time limit to find something that does not affect the user.
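A minimal host-side sketch of that idea, assuming a command queue, a built kernel, and its buffers already exist (the function name `run_in_chunks` and the chunk/sleep values are my own illustrations, and error checking is omitted); it slices a 1-D NDRange using the global work offset (OpenCL 1.1+) and sleeps briefly between slices so the display can be serviced:

```c
#include <CL/cl.h>
#include <unistd.h> /* usleep (POSIX); use Sleep() on Windows */

void run_in_chunks(cl_command_queue queue, cl_kernel kernel,
 size_t total_items, size_t chunk_items)
{
 for (size_t offset = 0; offset < total_items; offset += chunk_items) {
 size_t remaining = total_items - offset;
 size_t global = remaining < chunk_items ? remaining : chunk_items;

 /* Enqueue only a slice of the full problem; get_global_id(0)
 inside the kernel already includes the offset. */
 clEnqueueNDRangeKernel(queue, kernel, 1,
 &offset, /* global work offset */
 &global, /* slice size */
 NULL, 0, NULL, NULL);

 /* Block until the slice is done so a single submission never
 runs long enough to trip the driver watchdog. */
 clFinish(queue);

 /* Yield the GPU briefly so the desktop/compositor can run. */
 usleep(2000); /* ~2 ms; tune until the display stays responsive */
 }
}
```

You would pick `chunk_items` so that one slice finishes in roughly a frame time on your device, then adjust the sleep until the desktop feels usable.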
One solution is to use two display devices: one for the OS and another for computation. But there are benefits to breaking up a long run.

For example, suppose a GPU task will take 10 days. How do you know the GPU task is really running properly during that 10-day period? Breaking the task into segments of a few seconds allows you to add progress reporting to the controlling program. It also allows the controlling program to implement a periodic state save for resuming after a power failure.

If you want to use multiple GPUs to further accelerate the computation, then it is essential that the task is broken into smaller segments. A small segment of work can be given to each GPU as it finishes its previous segment; that way, all GPUs remain fully loaded until the task is complete. If instead the task is divided into large portions for each GPU, it will be difficult or impossible to size the pieces so that all GPUs complete at the same time.
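To make the segmented approach concrete, here is a rough sketch of such a controlling loop (not taken from the answer; the names `state_buf`, `host_state`, `checkpoint.bin`, and the segment size are invented for illustration, and error checking is omitted). It runs short segments, reports progress, and periodically reads back the GPU state so the run can resume after a failure:

```c
#include <CL/cl.h>
#include <stdio.h>

void run_with_checkpoints(cl_command_queue queue, cl_kernel kernel,
 cl_mem state_buf, size_t state_bytes,
 void *host_state, int total_segments)
{
 for (int seg = 0; seg < total_segments; ++seg) {
 size_t global = 1u << 20; /* one short segment of work (illustrative) */
 clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global,
 NULL, 0, NULL, NULL);
 clFinish(queue);

 /* Progress report from the controlling program. */
 fprintf(stderr, "segment %d / %d done\n", seg + 1, total_segments);

 /* Every 100 segments, copy the GPU state back and save it so
 the run can resume after a crash or power failure. */
 if ((seg + 1) % 100 == 0) {
 clEnqueueReadBuffer(queue, state_buf, CL_TRUE, 0,
 state_bytes, host_state, 0, NULL, NULL);
 FILE *f = fopen("checkpoint.bin", "wb");
 if (f) {
 fwrite(host_state, 1, state_bytes, f);
 fclose(f);
 }
 }
 }
}
```

The same loop structure extends naturally to multiple GPUs: instead of one queue, the controlling program hands the next segment to whichever device's queue has drained first.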
I believe most GPU workloads can be broken up into segments of a few seconds each without any significant performance loss. So in this sense, breaking up the task does not detract from the goal of 'pushing the limits' of GPU computation. If the controlling program dispatches work continuously to the GPU used by the OS display, it may still impact the responsiveness of the OS display. A solution to this problem that does not reduce performance is to access the machine remotely, using Remote Desktop, VNC, or similar.