RFC: stack trace generation

Mon Jul 19 18:50:00 GMT 2004

>>>>> "Bryce" == Bryce McKinlay <mckinlay@redhat.com> writes:

Bryce> Casey Marshall wrote:
>> What I'm trying now is making the file/line number lookup code do
>> so for the stack as a whole -- each `lookup' method in NameFinder
>> would receive the entire stack trace and an array of objects to
>> store the info it finds for each frame, skipping frames that have
>> already been filled in. Then I call each in a precedence order --
>> interpreted, DWARF-2 info, dladdr, and finally (not implemented
>> yet) addr2line.
>>
>> This makes the DWARF-2 part much faster, since it can optimize out
>> of having to read things more than once. It also has room for
>> inserting other methods for finding debug info; it isn't
>> "pluggable", but it could be.
>>

Bryce> Perhaps the best thing to do would be to make Dwarf2NameFinder
Bryce> a class, and have it remember whatever information/state about
Bryce> each .so is necessary to speed up repeated lookups. This would
Bryce> work better than passing an entire stack trace to the
Bryce> NameFinder, because different types of NameFinder are required
Bryce> for different frames - eg interpreted frames, and the logic to
Bryce> determine which name finder to use doesn't belong in the
Bryce> NameFinder itself. I'd like to keep NameFinder's role simple.

Good idea. My code is littered with `if (stack[i].interp != NULL)',
which I'd rather not have.

What giving the DWARF-2 lookup code the whole stack buys you is the
ability to fill in frame info as it is found; you are looking for a
specific address in the stack, but if you find others you fill them in
too. Caching would effectively do the same thing, though.
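
A rough sketch of the arrangement discussed above, with the choice of finder kept outside the finders themselves: each finder answers for a single address, and a small driver tries them in precedence order per frame. All names here (FrameInfo, FrameFinder, FinderChain) are hypothetical and are not the libgcj NameFinder API; the order interpreted, DWARF-2, dladdr, addr2line is taken from the quoted text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; not the libgcj API.
final class FrameInfo {
    final String file;
    final int line;
    FrameInfo(String file, int line) { this.file = file; this.line = line; }
}

interface FrameFinder {
    // Returns null when this finder knows nothing about the address.
    FrameInfo lookup(long address);
}

final class FinderChain {
    private final List<FrameFinder> finders = new ArrayList<>();

    // Finders are tried in the order they are added (highest precedence first).
    FinderChain add(FrameFinder finder) {
        finders.add(finder);
        return this;
    }

    // Resolves every frame; an entry stays null if no finder recognises it.
    FrameInfo[] resolve(long[] addresses) {
        FrameInfo[] result = new FrameInfo[addresses.length];
        for (int i = 0; i < addresses.length; i++) {
            for (FrameFinder finder : finders) {
                result[i] = finder.lookup(addresses[i]);
                if (result[i] != null) {
                    break; // first finder with an answer wins
                }
            }
        }
        return result;
    }
}
```

With this shape, a finder that caches what it has already decoded gets the "fill in others as you find them" benefit without ever seeing the whole stack.
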
Bryce> What I do currently (not necessarily the best solution) is to
Bryce> instantiate a NameFinder once for every
Bryce> Throwable.printStackTrace() call. The NameFinder.lookup()
Bryce> method is then called once for each frame with the IP for that
Bryce> frame. The namefinder determines which binary contains the
Bryce> address, using dladdr(), and starts an addr2line instance for
Bryce> that binary if necessary. The addr2line instances are kept open
Bryce> until close() is called on the namefinder once the bottom of
Bryce> the stack is reached. This means that the repeated lookup()
Bryce> calls do not each cause a new addr2line instance to be started
Bryce> if there is one already running for that binary.

Yeah, the debug lookup methods could be static, even, since they deal
with the application as a whole. There really isn't any reason to
create an instance per exception, is there?
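
One way to picture lookup state that belongs to the application rather than to a single exception is a cache keyed by binary path and filled on first use. This is only a sketch: PerBinaryCache and its loader function are hypothetical stand-ins for whatever per-.so state the real finder would keep.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: one cache shared by every stack trace,
// keyed by the path of the binary that contains the address.
final class PerBinaryCache<S> {
    private final Map<String, S> states = new ConcurrentHashMap<>();
    private final Function<String, S> loader;

    // The loader stands in for the expensive step, such as parsing debug info.
    PerBinaryCache(Function<String, S> loader) {
        this.loader = loader;
    }

    // Loads the state for a binary on first use, then reuses it.
    S stateFor(String binaryPath) {
        return states.computeIfAbsent(binaryPath, loader);
    }

    // Number of binaries whose state has been loaded so far.
    int loadedBinaries() {
        return states.size();
    }
}
```

A single static instance of such a cache would serve every printStackTrace() call, so the cost of loading a binary's state is paid once per binary rather than once per exception.
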
Bryce> Caching some data about each .so is presumably much less of a
Bryce> problem than keeping around a huge (50MB+ !) addr2line
Bryce> instance, so we could probably just keep the namefinder
Bryce> instance around for the lifetime of the application, after the
Bryce> first printStackTrace() call? Perhaps we can keep the mmap'ed
Bryce> .debug_line sections open too - presumably they can be mapped
Bryce> read-only, so the overall effect on application memory usage
Bryce> shouldn't be too significant?

I like this idea. Presumably the Dwarf2NameFinder would cache
something like a mapping from PC address ranges to regions in the
.debug_line sections? Much of the slowness I see right now is because
looking up an address has to start at the beginning of the section for
that library, which is monstrously slow for a library as large as
libgcj.
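
The kind of cache described here could look roughly like the following: an ordered map from the start of each PC range to the offset in .debug_line where decoding should begin, so a lookup jumps straight to the right region instead of scanning from the start of the section. LineSectionIndex and its methods are hypothetical names; the sketch assumes the ranges do not overlap and that addresses compare correctly as signed 64-bit values.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical index from PC ranges to byte offsets within a .debug_line section.
final class LineSectionIndex {
    private static final class Range {
        final long end;     // exclusive upper bound of the PC range
        final long offset;  // section offset where decoding should start
        Range(long end, long offset) { this.end = end; this.offset = offset; }
    }

    // Keyed by the inclusive start address of each range.
    private final TreeMap<Long, Range> ranges = new TreeMap<>();

    // Records that PCs in [start, end) are described starting at sectionOffset.
    void addRange(long start, long end, long sectionOffset) {
        ranges.put(start, new Range(end, sectionOffset));
    }

    // Returns the offset to start decoding from, or -1 if the PC is not covered.
    long offsetFor(long pc) {
        Map.Entry<Long, Range> entry = ranges.floorEntry(pc);
        if (entry == null || pc >= entry.getValue().end) {
            return -1;
        }
        return entry.getValue().offset;
    }
}
```

However the ranges are gathered (for example, one initial pass over the section), each later lookup costs a logarithmic map search plus decoding of a single region.
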
>> I could rewrite this part if need be, since I have a better grasp
>> of ELF than when I started this.
>>

Bryce> I think this may be necessary, unfortunately.

Ok.

-- 
Casey Marshall || csm@gnu.org