debugging hangs in high -j rust builds

Jeremy Drake cygwin@jdrake.com
Fri Sep 19 20:36:52 GMT 2025


I'm trying to debug hangs during Rust builds with high -j values.

1. There's a hung "rustc.exe" that I cannot attach to with gdb --pid. I
attached with windbg instead; it timed out and fell back to some alternate
means of attaching because the loader lock was held. Is there some way to
get gdb to attach in that way?

What windbg showed me is that this rustc is stuck in
cygheap_fixup_in_child, in an infinite loop walking through the
cygheap->chain. It appears to be stuck on a value 0x0000000800043710
whose prev is 0x0000000800043920, whose prev is 0x0000000800043710 ...

There are invisible (to ps) cargo.exe processes. One of them shows up in
pstree as
?(1)─┬─?(2287)───rustc
Attaching to 2287 with gdb shows a fairly normal cargo.exe with a bunch of
threads, but ultimately waiting on a child. The other cargo.exe I was able
to attach to with gdb --pid by giving it the winpid (I didn't know I could
do that, but gave it a try and it worked!). Here's where that one is:
(gdb) bt
#0 0x00007ffa705917a4 in ntdll!ZwWaitForMultipleObjects () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x00007ffa6dc36329 in WaitForMultipleObjectsEx () from /cygdrive/c/Windows/System32/KERNELBASE.dll
#2 0x00007ffa6dc3622e in WaitForMultipleObjects () from /cygdrive/c/Windows/System32/KERNELBASE.dll
#3 0x00007ff9ae662a46 in child_info::sync (this=0x7ff9ae84b940 <ch_spawn>, this@entry=0x0, pid=8220, hProcess=@0x7ffa0ac40: 0x63a8, howlong=howlong@entry=4294967295)
 at ../../../../winsup/cygwin/sigproc.cc:1138
#4 0x00007ff9ae666e51 in child_info_spawn::worker (this=<optimized out>, this@entry=0x7ff9ae84b940 <ch_spawn>, mode=3, mode@entry=4099,
 prog_arg=prog_arg@entry=0x80003fac0 "/home/user/rust/rust-1.90.0-1.x86_64/build/build-Cygwin/bootstrap/debug/rustc", args=...) at ../../../../winsup/cygwin/spawn.cc:924
#5 0x00007ff9ae66851a in spawnve (mode=4099, path=0x80003fac0 "/home/user/rust/rust-1.90.0-1.x86_64/build/build-Cygwin/bootstrap/debug/rustc", argv=0xa0bb2b100, envp=<optimized out>)
 at ../../../../winsup/cygwin/spawn.cc:1033
#6 0x00007ff9ae603be0 in execvp (file=<optimized out>, argv=0xa0bb2b100) at ../../../../winsup/cygwin/exec.cc:94
#7 0x00007ff9ae73cc64 in _sigfe () at sigfe.s:35
#8 0x000000010152c72c in std::sys::process::unix::unix::<impl std::sys::process::unix::common::Command>::do_exec ()
#9 0x000000010152c2e8 in std::sys::process::unix::unix::<impl std::sys::process::unix::common::Command>::spawn ()
#10 0x0000000100f63e29 in <cargo_util::process_builder::ProcessBuilder>::exec_with_streaming ()
#11 0x000000010078479c in <cargo::core::compiler::DefaultExecutor as cargo::core::compiler::Executor>::exec ()
#12 0x00000001007691a7 in cargo::core::compiler::rustc::{closure#3} ()
#13 0x0000000100aa5db5 in <<cargo::core::compiler::job_queue::job::Work>::then::{closure#0} as core::ops::function::FnOnce<(&cargo::core::compiler::job_queue::job_state::JobState,)>>::call_once::{shim:vtable#0} ()
#14 0x0000000100aa5db5 in <<cargo::core::compiler::job_queue::job::Work>::then::{closure#0} as core::ops::function::FnOnce<(&cargo::core::compiler::job_queue::job_state::JobState,)>>::call_once::{shim:vtable#0} ()
#15 0x0000000100ab9791 in <cargo::core::compiler::job_queue::job_state::JobState>::run_to_finish ()
#16 0x00000001009f5dd3 in std::sys::backtrace::__rust_begin_short_backtrace::<<cargo::core::compiler::job_queue::DrainState>::run::{closure#1}, ()> ()
#17 0x00000001009fd80b in <<std::thread::Builder>::spawn_unchecked_<<cargo::core::compiler::job_queue::DrainState>::run::{closure#1}, ()>::{closure#1} as core::ops::function::FnOnce<()>>::call_once::{shim:vt
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
(gdb) define plist
Type commands for definition of "plist".
End with a line saying just "end".
>set var $n = $arg0
>while $n != 0x0
 >printf "%p\n", $n
 >set var $n = $n->prev
 >end
>end
(gdb) plist cygheap->chain
0x800045370
0x800045330
0x800045260
0x8000451f0
0x800045180
0x800045140
0x8000450b0
0x800045080
0x800045030
0x800044fe0
0x800044fa0
0x800044f60
0x800044f20
0x800044ed0
0x800044e60
0x800044e10
0x800044dd0
0x800044d90
0x800044d50
0x800044d20
0x800044cf0
0x800044cc0
0x800044c50
0x800044ac0
0x800044a50
0x8000448c0
0x800044830
0x8000447a0
0x800044710
0x800044680
0x800044570
0x800044260
0x8000441f0
0x800044180
0x800044110
0x800044040
0x800043fd0
0x800043dc0
0x800043cb0
0x800043c40
0x800043bd0
0x800043b00
0x800043a30
0x800043920
0x800043710
0x800043920
0x800043710
0x800043920
...
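The walk above never terminates because two nodes point at each other. As a rough illustration only (a Python model, not Cygwin code and not a gdb command), a walk that remembers visited nodes stops at the first repeat and reports it. The function name walk_chain and the dict representation are assumptions made for this sketch; the addresses are the tail of the output above.

```python
def walk_chain(head, prev_of):
    """Follow prev links from head until 0 or until a node repeats.

    Returns (nodes_in_visit_order, repeated_node_or_None).
    """
    seen = set()
    order = []
    node = head
    while node != 0:
        if node in seen:
            return order, node  # the chain loops back to this node
        seen.add(node)
        order.append(node)
        node = prev_of[node]
    return order, None  # reached the end of the chain cleanly


# Tail of the hung child's chain as printed above; earlier links omitted.
child_prev = {
    0x800043a30: 0x800043920,
    0x800043920: 0x800043710,
    0x800043710: 0x800043920,  # points back instead of onward
}

order, repeat = walk_chain(0x800043a30, child_prev)
print("visited:", [hex(n) for n in order])
print("repeats at:", hex(repeat) if repeat is not None else None)
```

The same idea (stop when a node has been seen before, or after a fixed number of steps) could be applied to the plist loop to keep it from printing forever.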
I was thinking maybe the cygheap needed to be locked during spawn (I was
looking at the 3.6 code), but that locking has already been added in main,
and I was running a 3.7.0 snapshot that contained the fix. Maybe it needs
to be locked during fork as well? I don't know how else the chain would
become circular like that... The chain in the waiting parent is fine and
contains both 43710 and 43920:
...
0x800043b30
0x800043920
0x800043710
0x800043500
0x8000432f0
...
fork.cc does contain a refresh_cygheap call in frok::parent.
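To pin down which link differs between the two processes, the two walks can be compared node by node. The following is a minimal Python sketch, assuming each chain is modeled as a dict from node address to its prev. The name first_divergence is hypothetical, and the data is only the excerpt quoted in this message (links beyond the excerpt are unknown and left out).

```python
def first_divergence(start, parent_prev, child_prev):
    """Walk prev links from start in both maps.

    Returns (node, parent_prev_value, child_prev_value) for the first node
    whose prev differs, or None if no difference is found in the excerpt.
    """
    node = start
    seen = set()
    while node != 0 and node not in seen:
        seen.add(node)
        p = parent_prev.get(node)
        c = child_prev.get(node)
        if p is None or c is None:
            return None  # ran off the recorded excerpt
        if p != c:
            return node, p, c
        node = p
    return None


# Excerpt of the waiting parent's chain.
parent_prev = {
    0x800043920: 0x800043710,
    0x800043710: 0x800043500,
    0x800043500: 0x8000432f0,
}
# Excerpt of the hung child's chain.
child_prev = {
    0x800043920: 0x800043710,
    0x800043710: 0x800043920,
}

result = first_divergence(0x800043920, parent_prev, child_prev)
if result is not None:
    node, p, c = result
    print(f"node {node:#x}: parent prev {p:#x}, child prev {c:#x}")
```

On this excerpt the difference is at 0x800043710, whose prev is 0x800043500 in the parent but 0x800043920 in the child.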

