To further summarize the problem:
1. Have a C++ worker thread that gets its work chunks from a queue
2. Wrap all the internals in a C++ function callable from Lua
   a. enqueue the work chunk
   b. yield Lua
   c. wait for the results from the worker thread (we will get signaled)
   d. resume Lua by invoking lua_resume
3. The writer of the Lua script must not need to know anything about the internals of the call (we must not force them to yield/resume from Lua themselves)
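
To make steps 1, 2a and 2c concrete, here is a minimal sketch of the C++ side only, using just the standard library. The names `WorkChunk` and `Worker` are hypothetical, and a work chunk is assumed to be a callable that returns an `int`. The Lua part (yielding in 2b and calling lua_resume in 2d) is deliberately left out and only marked in comments, since how it is wired up depends on the Lua version and on who owns the Lua state.

```cpp
#include <cassert>
#include <condition_variable>
#include <exception>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical work chunk: the job to run plus a promise that is
// fulfilled when the worker thread has finished it (the "signal").
struct WorkChunk {
    std::function<int()> job;
    std::promise<int> result;
};

class Worker {
public:
    Worker() : thread_([this] { run(); }) {}

    ~Worker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        thread_.join();  // the worker drains the queue before exiting
    }

    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    // Step 2a: enqueue a work chunk. The returned future becomes ready
    // once the worker thread has produced the result (step 2c).
    std::future<int> enqueue(std::function<int()> job) {
        assert(job && "work chunk must hold a callable");
        WorkChunk chunk{std::move(job), std::promise<int>()};
        std::future<int> fut = chunk.result.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(chunk));
        }
        cv_.notify_one();
        // Step 2b would go here in the Lua-facing wrapper: yield the
        // coroutine; step 2d (lua_resume) would follow once the
        // future is ready. Both are omitted in this sketch.
        return fut;
    }

private:
    // Step 1: the worker thread takes its work chunks from the queue.
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stopping and nothing left to do
            WorkChunk chunk = std::move(queue_.front());
            queue_.pop();
            lock.unlock();  // run the job without holding the lock
            try {
                chunk.result.set_value(chunk.job());
            } catch (...) {
                chunk.result.set_exception(std::current_exception());
            }
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<WorkChunk> queue_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so it starts after the other members exist
};
```

With this in place, a caller would construct a `Worker`, call `enqueue` with a callable, and read the value from the returned future with `get()`. Whether the wait in 2c blocks the thread that owns the Lua state or is polled from an event loop is a separate design decision that this sketch does not settle.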