This kills like 17 birds with 1 stone. It allows displaying the proper &/&mut/*const/*mut type name for *-gnu targets, and fixes a bunch of issues with visualizing *-msvc targets.
In short, none of the debuggers (in their current states) respect the "name" field that is passed to LLVMRustDIBuilderCreatePointerType. That field does appear in the DWARF data, but GDB and LLDB don't care.
This patch wraps the pointer nodes (and the msvc array type) in a typedef before finalizing them. Worth noting, this typedef trick is already being used for *-msvc primitive types.
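To make the typedef trick concrete, here is a toy model of the idea. TypeNode, debugger_display_name, wrap_in_typedef and strip_typedefs are all hypothetical names made up for this sketch; they are not the compiler's or LLVM's real data structures, and the C-style rendering is only an assumption about how a C-oriented debugger behaves.

```rust
// Toy model of debug-info type nodes. Hypothetical: this is NOT how rustc,
// LLVM, GDB, or LLDB actually represent types.
#[derive(Debug, Clone)]
enum TypeNode {
    /// A named base type such as `u8`.
    Base { name: String },
    /// A pointer node. It carries a name, but the debugger ignores it.
    Pointer { name: String, pointee: Box<TypeNode> },
    /// A typedef: a named alias for another node.
    Typedef { name: String, underlying: Box<TypeNode> },
}

/// How a C-oriented debugger might render a type name (assumption): pointer
/// names are ignored and rebuilt C-style, typedef names are shown as-is.
fn debugger_display_name(node: &TypeNode) -> String {
    match node {
        TypeNode::Base { name } => name.clone(),
        TypeNode::Pointer { pointee, .. } => format!("{} *", debugger_display_name(pointee)),
        TypeNode::Typedef { name, .. } => name.clone(),
    }
}

/// The trick in miniature: wrap a pointer node in a typedef that carries the
/// Rust-style name the pointer node was given.
fn wrap_in_typedef(pointer: TypeNode) -> TypeNode {
    let name = match &pointer {
        TypeNode::Pointer { name, .. } => name.clone(),
        other => debugger_display_name(other),
    };
    TypeNode::Typedef { name, underlying: Box::new(pointer) }
}

/// The underlying pointer is still reachable by peeling typedefs off.
fn strip_typedefs(node: &TypeNode) -> &TypeNode {
    match node {
        TypeNode::Typedef { underlying, .. } => strip_typedefs(underlying),
        other => other,
    }
}
```

In this model a pointer node named &u8 renders as u8 * on its own, and as &u8 once wrapped, while strip_typedefs still reaches the pointer node underneath.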
*-gnu
Mainly fixes type-name output. Screenshots should be self-explanatory.
GDB
GDB by default hard-codes pointer types to *mut. This "fixes" that without requiring a code change in GDB, but that's mostly just a happy side effect.
image
LLDB
TypeSystemClang ignores the name field of the pointer in the DWARF info. Using a typedef sidesteps that deficiency. We could maybe modify TypeSystemClang so this isn't necessary, but since it relies on clang (read: C/C++) compiler type representations under the hood, I'm not sure if pointers can have names, and it's not really reasonable to change clang itself to accommodate Rust.
image
*-msvc
As opposed to DWARF, the name field does not exist anywhere in the PDB data. There are 2 reasons for this:
- Pointer nodes do not contain a name field
- Primitive types are unique, special nodes that have an additional unique, special representation for pointer-to-primitive
The issue with this is with container types, for example Vec. Vec<T>'s heap pointer is not a *mut T, it's a *mut u8 that is cast to a pointer to T when needed using (more or less) PhantomData<T>. From the type's perspective, T only "exists" in the generic parameters. That means the debugger, working from the type's perspective, must look it up by name (e.g. Vec<ref$<u8> > must look up the string "ref$<u8>").
Since those type names aren't in the PDB data, the lookup fails, the debugger cannot cast the heap pointer, and thus cannot visualize the elements of the container.
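A minimal sketch of the layout problem described above. ErasedBuf is a hypothetical, heavily simplified container, not the real Vec: it has a fixed capacity and does not support zero-sized types. What matters is that the only stored pointer points at u8, and T shows up only in the generic parameters and PhantomData.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::marker::PhantomData;
use std::ptr::NonNull;

/// Hypothetical, simplified container; NOT the real `Vec` layout.
struct ErasedBuf<T> {
    ptr: NonNull<u8>,        // the stored pointer knows nothing about `T`
    cap: usize,
    len: usize,
    _marker: PhantomData<T>, // `T` only "exists" in the generic parameters
}

impl<T> ErasedBuf<T> {
    fn with_capacity(cap: usize) -> Self {
        assert!(cap > 0 && std::mem::size_of::<T>() > 0, "sketch: no ZSTs or empty buffers");
        let layout = Layout::array::<T>(cap).expect("capacity overflow");
        // SAFETY: the layout has a non-zero size (checked above).
        let raw = unsafe { alloc(layout) };
        let ptr = NonNull::new(raw).expect("allocation failed");
        ErasedBuf { ptr, cap, len: 0, _marker: PhantomData }
    }

    fn push(&mut self, value: T) {
        assert!(self.len < self.cap, "sketch: no growth");
        // The cast to `*mut T` happens only at the point of use.
        // SAFETY: `len < cap`, so the slot is inside the allocation.
        unsafe { self.ptr.as_ptr().cast::<T>().add(self.len).write(value) };
        self.len += 1;
    }

    fn get(&self, index: usize) -> Option<&T> {
        if index >= self.len {
            return None;
        }
        // SAFETY: `index < len`, and every slot below `len` is initialized.
        Some(unsafe { &*self.ptr.as_ptr().cast::<T>().add(index) })
    }
}

impl<T> Drop for ErasedBuf<T> {
    fn drop(&mut self) {
        // SAFETY: slots below `len` are initialized; the layout matches `alloc`.
        unsafe {
            for i in 0..self.len {
                std::ptr::drop_in_place(self.ptr.as_ptr().cast::<T>().add(i));
            }
            dealloc(self.ptr.as_ptr(), Layout::array::<T>(self.cap).unwrap());
        }
    }
}
```

A debugger looking at ErasedBuf<u32> sees only a pointer to u8 in the fields, so to show the elements it has to find the element type by the name in the generic parameter list and cast to it, which is exactly the step that fails when that name is missing from the debug info.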
In LLDB, the sole arbiter of "what types exist" when doing a type lookup is the PDB data itself. I'm sure msdia works the same way, but LLDB's native PDB parser checks the type stream, and any pointer-node-to-T is formatted C-style as T *. This problem also affects Microsoft's debugger.
array$<T,N> also needs a typedef, as arrays have a bespoke node whose "name" field is also ignored in favor of the C-style format (e.g. T[N]). If you use Visual Studio's natvis diagnostics logging, you can see errors such as this:
Natvis: C:\Users\ant_b\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\etc\liballoc.natvis(10,23): Error: identifier "array$<u32,7>" is undefined
Error while evaluating '(array$<u32,7>*)buf.inner.ptr.pointer.pointer' in the context of type 'sample.exe!alloc::vec::Vec<array$<u32,7>,alloc::alloc::Global>'.
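The failing lookup in that log can be modeled roughly like this. It is a toy stand-in, not how natvis, msdia, or LLDB are implemented: first_generic_arg and can_cast_elements are hypothetical helpers, and the set of known names stands in for whatever type names the debug info actually contains.

```rust
use std::collections::HashSet;

/// Extracts the first generic argument from a type name, e.g.
/// `Vec<array$<u32,7>,Global>` yields `array$<u32,7>`.
fn first_generic_arg(type_name: &str) -> Option<&str> {
    let start = type_name.find('<')? + 1;
    let mut depth = 0usize;
    for (offset, ch) in type_name[start..].char_indices() {
        match ch {
            '<' => depth += 1,
            // A top-level `>` or `,` ends the first argument.
            '>' if depth == 0 => return Some(type_name[start..start + offset].trim()),
            '>' => depth -= 1,
            ',' if depth == 0 => return Some(type_name[start..start + offset].trim()),
            _ => {}
        }
    }
    None
}

/// Stand-in for a by-name type lookup: the element pointer can only be cast
/// if the element type's name is present in the debug info.
fn can_cast_elements(container_name: &str, known_type_names: &HashSet<&str>) -> bool {
    first_generic_arg(container_name).map_or(false, |elem| known_type_names.contains(elem))
}
```

With a name set that lacks array$<u32,7>, can_cast_elements returns false for alloc::vec::Vec<array$<u32,7>,alloc::alloc::Global>, mirroring the "identifier is undefined" error; once a typedef adds that name, the same lookup succeeds.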
LLDB (via CodeLLDB)
image
CDB via Visual Studio 2022
image
CDB via C/C++ extension in Visual Studio Code
The output is identical to Visual Studio, but I want to make a special note because I had to jump through a few hoops to get it to work correctly. Built with stable, it worked the same as the "Before" image from Visual Studio, but built with the patched compiler the Vec visualizer wasn't working.
Based on the Visual Studio "After" screenshot, the natvis files clearly still work. If you binary-patch the extension so that it outputs verbose logging info, it appears it never even tried to load liballoc.natvis for some reason?
I manually placed the natvis files in C:\Users\<USER>\.vscode\extensions\ms-vscode.cpptools-1.26.3-win32-x64\debugAdapters\vsdbg\bin\Visualizers\ and it worked fine so iunno. Probably worth someone else testing too. Might also be because I'm only using a stage1 build instead of a full toolchain install? I'm not sure.
Alternatives
I tried some fiddling with using the reference debug info node (which does have a valid counterpart in both DWARF and PDB). The issue is that LLDB uses TypeSystemClang, which is very C/C++-oriented. In Rust, references are borrowing pointers. In C++, references are not objects, and they are not guaranteed to have a representation at run-time. That means no array-of-refs, no ref-to-ref, no pointer-to-ref. LLDB seems to interpret ref-to-ref incorrectly. That can be worked around, but the hack necessary for that is heinous and infects the visualizers too. It also means that without the visualizers, the type-name output is sorta worse than it is now.
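For contrast, here is a small sketch of the Rust side of that mismatch: the shapes C++ forbids for references are all ordinary in Rust. The function names are made up for illustration.

```rust
/// Array (slice) of references: each element is a `&u32`, and iterating
/// yields a reference-to-reference (`&&u32`).
fn sum_through_double_ref(values: &[&u32]) -> u32 {
    values.iter().map(|r: &&u32| **r).sum()
}

/// Pointer-to-reference: a Rust reference is a value with an address of its
/// own, so a raw pointer can point at it.
fn read_via_pointer_to_ref(x: &u32) -> u32 {
    let r: &u32 = x;
    let p: *const &u32 = &r;
    // SAFETY: `p` points at `r`, which is alive for the whole function body.
    unsafe { **p }
}

/// A reference has the same size as a raw pointer to the same type.
fn ref_is_pointer_sized() -> bool {
    std::mem::size_of::<&u32>() == std::mem::size_of::<*const u32>()
}
```

Any debug-info encoding that maps Rust references onto C++ reference nodes has to deal with these nested cases, which is where the ref-to-ref trouble comes from.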