cygwin 3.6.0: No signals received after swapcontext() is used

Thu Mar 13 09:40:48 GMT 2025

Corinna Vinschen via Cygwin wrote:
> On Mar 12 17:06, Corinna Vinschen via Cygwin wrote:
>> On Mar 12 16:30, Corinna Vinschen via Cygwin wrote:
>>> On Mar 11 12:32, Christian Franke via Cygwin wrote:
>>>> The attached testcase should test the following use cases of setcontext:
>>>> - call from regular user space
>>>> - call from a signal handler interrupting user space
>>>> - call from a signal handler interrupting a system call
>>>>
>>>> It works as expected ... until the signal count reaches 256. Then signals
>>>> are again only delivered from inside of a system call.
>>>> [...]
>>>> Interesting... Hmm... is there some 8-bit counter which overflows and then
>>>> gets stuck at 0xff or 0x00?
>>> It's a kind of stack overflow. Kind of, because it's not the normal
>>> thread stack, but a special signal stack in the _cygtls area.
>>>
>>> When interrupting a running thread to call a signal handler, the context
>>> of the thread is changed to restart execution in an assembler function
>>> called sigdelayed(). The original IP of the thread is pushed on the
>>> aforementioned signal stack. Sigdelayed() calls the signal handler. On
>>> return it pops the original IP from the signal stack and continues the
>>> thread.
>>>
>>> Now guess what happens if the signal handler bails out with longjmp or
>>> setcontext/swapcontext.
>>>
>>> The signal handler never returns to the sigdelayed() function, the
>>> original address is never popped from the signal stack, and the signal
>>> stack has a max. size of 256 address entries...
>>>
>>> Theoretically, a small update to sigdelayed() would fix the issue: rather
>>> than popping the original IP from the signal stack after calling the
>>> handler, it should pop the IP prior to calling the handler. That would
>>> avoid filling up the signal stack when long-jumping out of the signal
>>> handler. It should store the IP in one of the callee-saved registers.
>>> %r13 is unused in sigdelayed so far.
>>>
>>> However, even if we do this, there's still the problem that sigdelayed()
>>> itself takes space on the stack. If you longjmp/setcontext out of the
>>> handler, the thread's normal stack will fill up with dead storage of the
>>> sigdelayed() function, and there's no way out of this trap. We can't
>>> restore the stack before the handler returns.
>>>
>>> So either way, at one point you get a stack overflow one way or the
>>> other.
>>>
>>> The signal stack overflow is actually rather harmless in comparison
>>> to a real stack overflow.
>>>
>>> If you have any idea how to avoid the real stack overflow, I'd be
>>> all ears.
>> Looks like this isn't really a problem with setcontext. It always
>> corrects the stack pointer as well. Apparently I haven't thought
>> long enough about this.
>>
>> I have a patch for sigdelayed() in the loop, stay tuned.
> Just pushed. Try cygwin-3.6.0-0.430.ga942476236b5 in a bit.

The problem no longer occurs. Also tested with 'kill -INT PID && sleep
0.01' in a loop.
-- 
Thanks,
Christian