head 3.1; access; symbols; locks; strict; comment @.\" @; 3.1 date 95.09.11.09.51.31; author grog; state Exp; branches; next 3.0; 3.0 date 95.06.25.10.55.49; author grog; state Exp; branches; next 2.8; 2.8 date 95.06.25.10.55.49; author grog; state Exp; branches; next 2.7; 2.7 date 95.06.24.11.45.14; author grog; state Exp; branches; next 2.6; 2.6 date 95.06.09.04.31.13; author grog; state Exp; branches; next 2.5; 2.5 date 95.05.17.17.33.19; author grog; state Exp; branches; next 2.4; 2.4 date 95.02.22.14.24.21; author grog; state Exp; branches; next 2.3; 2.3 date 95.02.03.13.25.50; author grog; state Exp; branches; next 2.2; 2.2 date 95.01.25.14.34.05; author grog; state Exp; branches; next 2.1; 2.1 date 95.01.25.13.34.27; author grog; state Exp; branches; next 2.0; 2.0 date 94.12.21.16.58.30; author grog; state Exp; branches; next ; desc @@ 3.1 log @Fix typos @ text @.\" For emacs, this file is in -*- nroff-fill -*- mode .\" $Id: testing.ms,v 3.0 1995/06/25 10:55:49 grog Exp grog $ .\" $Log: testing.ms,v $ .\" Revision 3.0 1995/06/25 10:55:49 grog .\" Final draft .\" .\" Revision 2.8 1995/06/25 10:55:49 grog .\" Final draft, second cut .\" .\" Revision 2.7 1995/06/24 11:45:14 grog .\" Final draft, first cut. .\" .\" Revision 2.6 1995/06/09 04:31:13 grog .\" Remove date from page headers .\" Minor mods .\" .\" Revision 2.5 1995/05/17 17:33:19 grog .\" Major mods after Andy's final draft review .\" .\" Revision 2.4 1995/02/22 14:24:21 grog .\" Minor mods .\" .\" Revision 2.3 1995/02/03 13:25:50 grog .\" Mods after Andy's review .\" .\" Revision 2.2 1995/01/25 14:34:05 grog .\" Minor mods .\" .\" Revision 2.1 1995/01/25 13:34:27 grog .\" Minor mods .\" .\" .so global.ms .Se \*[nchtest] "Testing the results" .St "Testing" Finally \fImake\fR has run through to the end and has not reported errors. Your source tree now contains all the objects and executables. You're done! 
.LP After a brief moment of euphoria, you sit down at the keyboard and start the program: .Ps $ \f(CBxterm\f(CW Segmentation fault - core dumped .Pe Well, maybe you're not quite done after all. Occasionally the program does not work as advertised. What you do now depends on how much programming experience you have. If you are a complete beginner, you could be in trouble--about the only thing you can do (apart from asking somebody else) is to go back and check that you really did configure the package correctly. .LP On the other hand, if you have even a slight understanding of programming, you should try to analyze the cause of the error--it's easier than you think. Hold on, and try not to look down. .LP There are thousands of possible reasons for the problems you encounter when you try to run a buggy executable, and lots of good books explain debugging techniques. In this chapter, we will touch only on aspects of debugging that relate to porting. First we'll attack a typical, if somewhat involved, real-life bug, and solve it, discussing the pros and cons on the way. Then we'll look at alternatives to traditional debuggers: kernel and network tracing. .LP Before you even start your program, of course, you should check if any test programs are available. Some packages include their own tests, and separate test suites are available for others. For other packages there may be test suites that were not designed for the package, but that can be used with it. If there are any tests, you should obviously run them. You might also consider writing some tests and including them as a target \s10\f(CWtest\fR\s0 in the \fIMakefile\fR. .Bh "What makes ported programs fail?" .XX "failure, causes" Ported programs don't normally fail for the same reasons as programs under development. A program under development still has bugs that prevent it from running correctly on any platform, while a ported program has already run reasonably well on some other platform. 
If it doesn't run on your platform, the reasons are usually: .Ls B .Li A latent bug has found more fertile feeding ground. For example, a program may read from a null pointer. This frequently doesn't get noticed if the data at address 0 doesn't cause the program to do anything unusual. On the other hand, if the new platform does not have any memory mapped at address 0, it will cause a segmentation violation or a bus error. .\" XXX Do we need this? .\" \** .\" .FS .\" See \*[chsignal], page \*[SIGBUSEGV], for more details. .\" .FE .Li Differences in the implementation of library functions or kernel functionality cause the program to behave differently in the new environment. For example, the function \s10\f(CWsetpgrp\fR\s0 has completely different semantics under System V and under BSD. See \*[chkdepend], page \*[setpgrp], for more details. .Li The configuration scripts have never been adequately tested for your platform. As a result, the program contains bugs that were not in the original versions. .Le .Ah "A strategy for testing" .XX "strategy, for testing" .XX "testing, strategy" When you write your own program with its own bugs, it helps to understand exactly what the program is trying to do: if you sit back and think about it, you can usually shorten the debugging process. When debugging software that you have just ported, the situation is different: you \fIdon't\fR understand the package, and learning its internals could take months. You need to find a way to track down the bug without getting bogged down with the specifics of how the package works. .LP .XX "xterm" You can overdo this approach, of course. It still helps to know what the program is trying to do. For example, when \fIxterm\fR dies, it's nice to know roughly how \fIxterm\fR works: it opens a window on an X server and emulates a terminal in this window. If you know something about the internals of X11, this will also be of use to you. 
But it's not time-effective to try to fight your way through the source code of \fIxterm\fR. .LP .XX "GIGO" .XX "garbage in, garbage out" In the rest of this chapter, we'll use this bug (yes, it was a real live bug in X11R6) to look at various techniques that you can use to localize and finally pinpoint the problem. The principle we use is the old \fIGIGO\fR principle--garbage in, garbage out. We'll subdivide the program into pieces which we can conveniently observe, and check which of them does not produce the expected output. After we find the piece with the error, we subdivide it further and repeat the process until we find the bug. The emphasis in this method is on \fIconvenient\fR: it doesn't necessarily have to make sense. As long as you can continue to divide your problem area into between two and five parts and localize the problem in one of the parts, it won't take long to find the bug. .LP So what's a convenient way to look at the problems? That depends on the tools you have at your disposal: .Ls B .Li If you have a symbolic debugger, you can divide your problem into the individual functions and examine what goes in and what goes out. .Li If you have a system call trace program, such as \fIktrace\fR or \fItruss\fR, you can monitor what the program says to the system and what the system replies. .Li If you have a communications line trace program, you can try to divide your program into pieces that communicate across this line, so you can see what they are saying to each other. .Le Of course, we have all these things. In the following sections we'll look at each of them in more detail. .Ah "Symbolic debuggers" .XX "symbolic debugger" .XX "debugger, symbolic" .XX "gdb" .XX "pyramid" If you don't have a symbolic debugger, get one. Now. Many people still claim to be able to get by without a debugger, and it's horrifying how many people don't even know how to use one. Of course you can debug just about anything without a symbolic debugger. 
Historians tell us that you can build pyramids without wheels--that's a comparable level of technology to testing without a debugger. The GNU debugger, \fIgdb\fR, is available on just about every platform you're likely to encounter, and though it's not perfect, it runs rings around techniques like putting \fIprintf\fR statements in your programs. .LP .XX "attach, debugger" .XX "debugger, attach" In UNIX, a debugger is a process that takes control of the execution of another process. Most versions of UNIX allow only one way for the debugger to take control: it must start the process that it debugs. Some versions, notably SunOS 4, but not Solaris 2, also allow the debugger to \fIattach\fR to a running process. .LP Whichever debugger you use, there are a surprisingly small number of commands that you need. In the following discussion, we'll look at the command set of \fIgdb\fR, since it is widely used. The commands for other symbolic debuggers vary considerably, but they normally have similar purposes. .Ls B .Li .XX "where am I?" .XX "how did I get here?" .XX "stack trace" A \fIstack trace\fR command answers the question, "Where am I, and how did I get here?", and is almost the most useful of all commands. It's certainly the first thing you should do when examining a core dump or after getting a signal while debugging the program. \fIgdb\fR implements this function with the \s10\f(CWbacktrace\fR\s0 command. .Li \fIDisplaying data\fR is the most obvious requirement: what is the current value of the variable \s10\f(CWbar\fR\s0? In \fIgdb\fR, you do this with the \s10\f(CWprint\fR\s0 command. .Li \fIDisplaying register contents\fR is really the same thing as displaying program data. In \fIgdb\fR, you display individual registers with the \s10\f(CWprint\fR\s0 command, or all registers with the \s10\f(CWinfo registers\fR\s0 command. .Li \fIModifying data and register contents\fR is an obvious way of modifying program execution. 
In \fIgdb\fR, you do this with the \s10\f(CWset\fR\s0 command. .XX "breakpoint, debugger" .XX "debugger, breakpoint" .Li \fIBreakpoints\fR stop execution of the process when the process attempts to execute an instruction at a certain address. \fIgdb\fR sets breakpoints with the \s10\f(CWbreak\fR\s0 command. .Li .XX "watchpoint, debugger" .XX "debugger, watchpoint" Many modern machines have hardware support for more sophisticated breakpoint mechanisms. For example, the i386 architecture can support four hardware breakpoints on instruction fetch (in other words, traditional breakpoints), memory read or memory write. These features are invaluable in systems that support them; unfortunately, UNIX usually does not. \fIgdb\fR simulates this kind of breakpoint with a so-called \fIwatchpoint\fR. When watchpoints are set, \fIgdb\fR simulates program execution by single-stepping through the program. When the condition (for example, writing to the global variable \s10\f(CWfoo\fR\s0) is fulfilled, the debugger stops the program. This slows down the execution speed by several orders of magnitude, whereas a real hardware breakpoint has no impact on the execution speed.\** .FS Some architectures slow the overall execution speed slightly in order to test the hardware registers. This effect is negligible. .FE .Li .XX "program counter" .XX "instruction pointer" \fIJumping\fR (changing the address from which the next instruction will be read) is really a special case of modifying register contents, in this case the \fIprogram counter\fR (the register that contains the address of the next instruction). This register is also sometimes called the \fIinstruction pointer\fR, which makes more sense. In \fIgdb\fR, use the \s10\f(CWjump\fR\s0 command to do this. Use this command with care: if the compiler expects the stack to look different at the source and at the destination, this can easily cause incorrect execution. 
.Li .XX "single stepping, in debugger" .XX "debugger, single stepping" \fISingle stepping\fR in its original form is supported in hardware by many architectures: after executing a single instruction, the machine automatically generates a hardware interrupt that ultimately causes a \s10\f(CWSIGTRAP\fR\s0 signal to the debugger. \fIgdb\fR performs this function with the \s10\f(CWstepi\fR\s0 command. .Li You won't want to execute individual machine instructions until you are in deep trouble. Instead, you will execute a \fIsingle line\fR instruction, which effectively single steps until you leave the current line of source code. To add to the confusion, this is also frequently called \fIsingle stepping\fR. This command comes in two flavours, depending on how it treats function calls. One form will execute the function and stop the program at the next line after the call. The other, more thorough form will stop execution at the first executable line of the function. It's important to notice the difference between these two functions: both are extremely useful, but for different things. \fIgdb\fR performs single line execution omitting calls with the \s10\f(CWnext\fR\s0 command, and includes calls with the \s10\f(CWstep\fR\s0 command. .Le There are two possible approaches when using a debugger. The easier one is to wait until something goes wrong, then find out where it happened. This is appropriate when the process gets a signal and does not overwrite the stack: the \s10\f(CWbacktrace\fR\s0 command will show you how it got there. .LP Sometimes this method doesn't work well: the process may end up in no-man's-land, and you see something like: .Ps Program received signal SIGSEGV, Segmentation fault. 0x0 in ?? () (gdb) \f(CBbt\f(CW \fI\&abbreviation for \f(BIbacktrace\f(CW #0 0x0 in ?? () \fI\&nowhere\f(CW (gdb) .Pe Before dying, the process has mutilated itself beyond recognition. Clearly, the first approach won't work here. 
In this case, we can start by conceptually dividing the program into a number of parts: initially we take the function \s10\f(CWmain\fR\s0 and the set of functions which \s10\f(CWmain\fR\s0 calls. By single stepping over the function calls until something blows up, we can localize the function in which the problem occurs. Then we can restart the program and single step through this function until we find what it calls before dying. This iterative approach sounds slow and tiring, but in fact it works surprisingly well. .Ah "Libraries and debugging information" .XX "libraries, debugging information" .XX "debugging information, in libraries" .XX "xterm" Let's come back to our \fIxterm\fR program and use \fIgdb\fR to figure out what is going on. We could, of course, look at the core dump, but in this case we can repeat the problem at will, so we're better off looking at the live program. We enter: .Pn first-xterm-death .Ps $ \f(CBgdb xterm\f(CW \fI(political statement for the FSF omitted)\f(CW (gdb) r -display allegro:0 \fI\&run the program\f(CW Starting program: /X/X11/X11R6/xc/programs/xterm/xterm -display allegro:0 Program received signal SIGBUS, Bus error. 0x3b0bc in _XtMemmove () (gdb) bt \fI\&look back down the stack\f(CW #0 0x3b0bc in _XtMemmove () \fI\&all these functions come from the X toolkit\f(CW #1 0x34dcd in XtScreenDatabase () #2 0x35107 in _XtPreparseCommandLine () #3 0x4e2ef in XtOpenDisplay () #4 0x4e4a1 in _XtAppInit () #5 0x35700 in XtOpenApplication () #6 0x357b5 in XtAppInitialize () #7 0x535 in main () (gdb) .Pe The stack trace shows that the main program called \s10\f(CWXtAppInitialize\fR\s0, and the rest of the stack shows the program deep in the X Toolkit, one of the central X11 libraries. If this were a program that you had just written, you could expect it to be a bug in your program. In this case, where we have just built the complete X11 core system, there's also every possibility that it is a library bug. 
As usual, the library was compiled without debug information, and without that you hardly have a hope of finding it. .LP Apart from size constraints, there is no reason why you can't include debugging information in a library. The object files in libraries are just the same as any others--we discuss them in detail on page \*[libdef]. If you want, you can build libraries with debugging information, or you can take individual library routines and compile them separately. .LP .XX "libXt.a" Unfortunately, the size constraints are significant: without debugging information, the file \fIlibXt.a\fR is about 330 kB long and contains 53 object files. With debugging information, it might easily reach 20 MB, since all the myriad X11 global symbols would be included with each object file in the archive. It's not just a question of disk space: you also need virtual memory during the link phase to accommodate all these symbols. Most of these files don't interest us anyway: the first one that does is the one that contains \s10\f(CW_XtMemmove\fR\s0. So we find where it is and compile it alone with debugging information. .LP .XX "X Toolkit" That's not as simple as it sounds: first we need to find the source file, and to do that we need to find the source directory. We could read the documentation, but to do that we need to know that the \fIXt\fR functions are in fact the X toolkit. If we're using GNU \fImake\fR, or if our \fIMakefile\fR documents directory changes, an alternative would be to go back to our \fImake\fR log and look for the text \fIXt\fR. If we do this, we quickly find .Ps make[4]: Leaving directory `/X/X11R6/xc/lib/Xext' making Makefiles in lib/Xt... mv Makefile Makefile.bak make[4]: Entering directory `/X/X11R6/xc/lib/Xt' make[4]: Nothing to be done for `Makefiles'. make[4]: Leaving directory `/X/X11R6/xc/lib/Xt' .Pe .XX "XtMemmove, function" .XX "function, XtMemmove" So the directory is \fI/X/X11R6/xc/lib/Xt\fR. 
The next step is to find the file that contains \s10\f(CWXtMemmove\fR\s0. There is a possibility that it is called \fIXtMemmove.c\fR, but in this case there is no such file. We'll have to grep for it. Some versions of \fIgrep\fR have an option to descend recursively into subdirectories, which can be very useful if you have one available. Another useful tool is \fIcscope\fR, which is supplied with System V. .\" Thanks for cscope, Jox. .Ps $ \f(CBgrep XtMemmove *.c\f(CW Alloc.c:void _XtMemmove(dst, src, length) Convert.c: XtMemmove(&p->from.addr, from->addr, from->size); \fI\&... many more references to XtMemmove\f(CW .Pe So \s10\f(CWXtMemmove\fR\s0 is in \fIAlloc.c\fR. By the same method, we look for the other functions mentioned in the stack trace and discover that we also need to recompile \fIInitialize.c\fR and \fIDisplay.c\fR. .XX "Alloc.c" .XX "Initialize.c" .XX "Display.c" .LP In order to compile debugging information, we add the compiler option \s10\f(CW-g\fR\s0. At the same time, we remove \s10\f(CW-O\fR\s0. \fIgcc\fR doesn't require this, but it's usually easier to debug a non-optimized program. We have three choices of how to set the options: .Ls B .Li .XX "make, World target" .XX "World, make target" We can modify the \fIMakefile\fR (\fImake World\fR, the main \fImake\fR target for X11, rebuilds the \fIMakefile\fRs from the corresponding \fIImakefile\fRs, so this is not overly dangerous). .Li If we have a working version of \fIxterm\fR, we can use its facilities: first we start the compilation with \fImake\fR, but we don't need to wait for the compilation to complete: as soon as the compiler invocation appears on the screen, we abort the build with \s10\f(CWCTRL-C\fR\s0. Using the \fIxterm\fR copy function, we copy the compiler invocation to the command line and add the options we want: .\" XXX I don't like the way this looks. I don't seem to be getting f(BI below, .\" and it doesn't look too good anyway. Any ideas? 
How about 'reverse video' .\" for the marked-up stuff? .Ps $ \f(CBrm Alloc.o Initialize.o Display.o\f(CW \fI\&remove the old objects\f(CW $ \f(CBmake\f(CW \fI\&and start make normally\f(CW rm -f Alloc.o \f(BIgcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -I../.. \e -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Alloc.c\f(CW ^C \fIinterrupt make with CTRL-C\f(CW make: *** [Alloc.o] Interrupt \fIcopy the invocation lines above with the mouse, and paste below, then modify as shown in bold print\f(CW $ gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -I../.. \e -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Alloc.c \f(CB-g\f(CW .Pe You can also use \s10\f(CWmake -n\fR\s0, which just shows the commands that \s10\f(CWmake\fR\s0 would execute, rather than aborting the \s10\f(CWmake\fR\s0, but you frequently find that \s10\f(CWmake -n\fR\s0 prints out a whole lot of stuff you don't expect. When you have made \fIAlloc.o\fR, you can repeat the process for the other two object files. .Li We could change \s10\f(CWCFLAGS\fR\s0 from the \fImake\fR command line. Our first attempt doesn't work too well, though. If you compare the following line with the invocation above, you'll see that a whole lot of options are missing. They were all in \s10\f(CWCFLAGS\fR\s0; by redefining \s10\f(CWCFLAGS\fR\s0, we lose them all: .Ps $ \f(CBmake CFLAGS=-g\f(CW rm -f Alloc.o gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -g Alloc.c .Pe .IP \s10\f(CWCFLAGS\fR\s0 included all the compiler options starting from \s10\f(CW-I../..\fR\s0, so we need to write: .Ps $ \f(CBmake CFLAGS='-g -c -I../.. -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL'\f(CW .Pe .Le .XX "Alloc.o" .XX "Initialize.o" .XX "Display.o" When we have created all three new object files, we can let \fImake\fR complete the library for us. 
It will not try to remake these object files, since now they are newer than any of their dependencies: .Ps $ \f(CBmake\f(CW \fI\&run make to build a new library\f(CW rm -f libXt.a ar clq libXt.a ActionHook.o Alloc.o ArgList.o Callback.o ClickTime.o Composite.o \e Constraint.o Convert.o Converters.o Core.o Create.o Destroy.o Display.o Error.o \e Event.o EventUtil.o Functions.o GCManager.o Geometry.o GetActKey.o GetResList.o \e GetValues.o HookObj.o Hooks.o Initialize.o Intrinsic.o Keyboard.o Manage.o \e NextEvent.o Object.o PassivGrab.o Pointer.o Popup.o PopupCB.o RectObj.o \e Resources.o Selection.o SetSens.o SetValues.o SetWMCW.o Shell.o StringDefs.o \e Threads.o TMaction.o TMgrab.o TMkey.o TMparse.o TMprint.o TMstate.o VarCreate.o \e VarGet.o Varargs.o Vendor.o ranlib libXt.a rm -f ../../usrlib/libXt.a cd ../../usrlib; ln ../lib/Xt/libXt.a . $ .Pe Now we have a copy of the X Toolkit in which these three files have been compiled with symbols. Next, we need to rebuild \fIxterm\fR. That's straightforward enough: .Ps $ \f(CBcd ../../programs/xterm/\f(CW $ \f(CBpwd\f(CW /X/X11R6/xc/programs/xterm $ \f(CBmake\f(CW rm -f xterm gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -fwritable-strings -o xterm \e -L../../usrlib main.o input.o charproc.o cursor.o util.o tabs.o screen.o \e scrollbar.o button.o Tekproc.o misc.o VTPrsTbl.o TekPrsTbl.o data.o menu.o -lXaw \e -lXmu -lXt -lSM -lICE -lXext -lX11 -L/usr/X11R6/lib -lpt -ltermlib .Pe .LP Finally, we try again. Since the library is not in the current directory, we use the \s10\f(CWdir\fR\s0 command to tell \fIgdb\fR where to find the sources. 
Now we get: .Ps $ \f(CBgdb xterm\f(CW (gdb) \f(CBdir ../../lib/X11\f(CW \fI\&set source paths\f(CW Source directories searched: /X/X11/X11R6/xc/programs/xterm/../../lib/X11:$cdir:$cwd (gdb) \f(CBdir ../../lib/Xt\f(CW Source directories searched: /X/X11/X11R6/xc/programs/xterm/../../lib/Xt:/X/X11/X11R6/xc/programs/xterm/../..\e /lib/X11:$cdir:$cwd (gdb) \f(CBr\f(CW \fI\&and run the program\f(CW Starting program: /X/X11/X11R6/xc/programs/xterm/xterm Program received signal SIGBUS, Bus error. 0x3ced6 in _XtMemmove (dst=0x342d8 "\(-DE\003", src=0x41c800 "", length=383) \e at Alloc.c:101 101 *dst++ = *src++; (gdb) .Pe This shows a typical byte for byte memory move. About the only thing that could cause a bus error on that statement would be an invalid address, but the parameters show that they appear to be valid. .LP There are at least two possible gotchas here: .Ls B .Li The debugger may be lying. The parameters it shows are the parameters on the stack. If the code has been optimized, there is a very good chance that the source and destination addresses are stored in registers, and thus the value of \s10\f(CWdst\fR\s0 on the stack is not up to date. .Li The destination address may be in the text segment, in which case an attempt to write to it will cause some kind of error. Depending on the system it could be a segmentation violation or a bus error. .Le The most reliable way to find out what is really going on is to look at the machine instructions being executed. 
First we tell the debugger to look at the current instruction and the following five instructions: .Ps (gdb) \f(CBx/6i $eip\f(CW \fI\&list the next 6 instructions\f(CW 0x3ced6 <_xtmemmove+74>: movb %al,(%edx) 0x3ced8 <_xtmemmove+76>: incl 0xc(%ebp) 0x3cedb <_xtmemmove+79>: incl 0x8(%ebp) 0x3cede <_xtmemmove+82>: jmp 0x3cec2 <_xtmemmove+54> 0x3cee0 <_xtmemmove+84>: leave 0x3cee1 <_xtmemmove+85>: ret .Pe The first instruction is a byte move, from register \s10\f(CWal\fR\s0 to the address stored in register \s10\f(CWedx\fR\s0. Let's look at the address in \s10\f(CWedx\fR\s0: .Ps (gdb) \f(CBp/x $edx\f(CW $9 = 0x342d8 .Pe Well, this is our \s10\f(CWdst\fR\s0 address alright--why can't it store there? It would be nice to be able to try to set values in memory and see if the debugger can do it: .Ps (gdb) \f(CBset *dst = 'X'\f(CW (gdb) \f(CBp *dst\f(CW $13 = 88 'X' .Pe That looks writable enough. Unfortunately, you can't rely on the debugger to tell the truth. Debuggers must be able to write to the text segment. If the write had failed, you could have been sure that the address was not writable, but if the write succeeds, you can't be sure. What we need to know are the exact segment limits. Some debuggers show you the segment limits, but current versions of \fIgdb\fR do not. An alternative is the \fIsize\fR command: .XX "size" .Ps $ \f(CBsize xterm\f(CW text data bss dec hex filename 846204 56680 23844 926728 e2408 xterm .Pe The text segment is 846204 decimal bytes long (0xce97c), and on this system (SCO UNIX) it starts at address 0, so the address is, indeed, in the text segment. But where did it come from? To find an answer to that question, we need to look at the calling function. 
In \fIgdb\fR, we do this with the \s10\f(CWframe\fR\s0 command: .Ps (gdb) \f(CBf 1 \fIlook at the calling function (frame 1)\f(CW #1 0x35129 in _MergeOptionTables (src1=0x342d8, num_src1=24, src2=0x400ffe, num_src2=64, dst=0x7ffff9c0, num_dst=0x7ffff9bc) at Initialize.c:602 602 (void) memmove(table, src1, sizeof(XrmOptionDescRec) * num_src1 ); .Pe That's funny--last time it died, the function was called from \f(CWXtScreenDatabase\fR,\** .FS See frame 1 in the stack trace on page \*[first-xterm-death]. .FE not from \f(CW_MergeOptionTables\fR. Why? At the moment it's difficult to say for sure, but it's possible that this difference happened because we removed optimization. In any case, we still have a problem, so we should fix this one first and then go back and look for the other one if solving this problem isn't enough. .LP In this case, the \fIframe\fR command doesn't help much, but it does tell us that the destination variable is called \s10\f(CWtable\fR\s0, and implicitly that \s10\f(CWmemmove\fR\s0 has been defined as \f(CW_XtMemmove\fR in this source file. We could now look at the source file in an editor in a different X window, but it's easier to list the source lines around the current line with the \f(CWlist\fR command: .Ps (gdb) \f(CBl\f(CW 597 enum {Check, NotSorted, IsSorted} sort_order = Check; 598 599 *dst = table = (XrmOptionDescRec*) 600 XtMalloc( sizeof(XrmOptionDescRec) * (num_src1 + num_src2) ); 601 602 (void) memmove(table, src1, sizeof(XrmOptionDescRec) * num_src1 ); 603 if (num_src2 == 0) { 604 *num_dst = num_src1; 605 return; 606 } .Pe So, the address is returned by the function \f(CWXtMalloc\fR--it seems to be allocating storage in the text segment. At this point, we could examine it more carefully, but let's first be sure that we're looking at the right problem. The address in \s10\f(CWtable\fR\s0 should be the same as the address in the parameter \s10\f(CWdst\fR\s0 of \s10\f(CW_XtMemmove\fR\s0. 
We're currently examining the environment of \s10\f(CW_MergeOptionTables\fR\s0, so we can look at it directly: .Ps (gdb) \f(CBp table\f(CW $29 = (XrmOptionDescRec *) 0x41c800 .Pe That looks just fine. Where did this strange \s10\f(CWdst\fR\s0 address come from? Let's set a breakpoint on the call to \s10\f(CWmemmove\fR\s0 on line 602, and then restart the program: .Xs (gdb) \f(CBb 602\f(CW Breakpoint 8 at 0x35111: file Initialize.c, line 602. (gdb) \f(CBr\f(CW The program being debugged has been started already. Start it from the beginning? (y or n) \f(CBy\f(CW Starting program: /X/X11/X11R6/xc/programs/xterm/xterm Breakpoint 8, _MergeOptionTables (src1=0x342d8, num_src1=24, src2=0x400ffe, num_src2=64, dst=0x7ffff9c0, num_dst=0x7ffff9bc) at Initialize.c:602 602 (void) memmove(table, src1, sizeof(XrmOptionDe (gdb) \f(CBp table\f(CW \fI\&look again, to be sure\f(CW $31 = (XrmOptionDescRec *) 0x41c800 (gdb) \f(CBs\f(CW \fI\&single step into memmove\f(CW _XtMemmove (dst=0x342d8 "\(-DE\003", src=0x41c800 "", length=384) at Alloc.c:94 94 if (src < dst) { .Xe This is really strange! \s10\f(CWtable\fR\s0 has a valid address in the data segment, but the address we pass to \s10\f(CW_XtMemmove\fR\s0 is in the text segment and seems unrelated. It's not clear what we should look at next: .Ls B .Li The source of the function calls \s10\f(CWmemmove\fR\s0, but after preprocessing it ends up calling \s10\f(CW_XtMemmove\fR\s0. \s10\f(CWmemmove\fR\s0 might simply be defined as \s10\f(CW_XtMemmove\fR\s0, but it might also be defined with parameters, in which case some subtle type conversions might result in our problem. .Li If you understand the assembler of the system, it might be instructive to look at the actual instructions that the compiler produces. 
.Le It's definitely quicker to look at the assembler instructions than to fight your way through the thick undergrowth in the X11 source tree: .Ps (gdb) \f(CBx/8i $eip\f(CW \fI\&look at the next 8 instructions\f(CW 0x35111 <_mergeoptiontables+63>: movl 0xc(%ebp),%edx 0x35114 <_mergeoptiontables+66>: movl %edx,0xffffffd8(%ebp) 0x35117 <_mergeoptiontables+69>: movl 0xffffffd8(%ebp),%edx 0x3511a <_mergeoptiontables+72>: shll $0x4,%edx 0x3511d <_mergeoptiontables+75>: pushl %edx 0x3511e <_mergeoptiontables+76>: pushl 0xfffffffc(%ebp) 0x35121 <_mergeoptiontables+79>: pushl 0x8(%ebp) 0x35124 <_mergeoptiontables+82>: call 0x3ce8c <_xtmemmove> .Pe This isn't easy stuff to handle, but it's worth understanding, so we'll pull it apart, instruction for instruction. It's easier to understand this discussion if you refer to the diagrams of stack structure in \*[chobj], page \*[complete-stack]. .Ls B .Li \s10\f(CWmovl 0xc(%ebp),%edx\fR\s0 takes the content of the stack word offset 12 in the current stack frame and places it in register \s10\f(CWedx\fR\s0. As we have seen, this is \s10\f(CWnum_src1\fR\s0, the second parameter passed to \s10\f(CW_MergeOptionTables\fR\s0. .Li \s10\f(CWmovl %edx,0xffffffd8(%ebp)\fR\s0 stores the value of \s10\f(CWedx\fR\s0 at offset -40 in the current stack frame. This is for temporary storage. .Li \s10\f(CWmovl 0xffffffd8(%ebp),%edx\fR\s0 does \fIexactly\fR the opposite: it loads register \s10\f(CWedx\fR\s0 from the location where it just stored it. These two instructions are completely redundant. They are also a sure sign that the function was compiled without optimization. .Li \s10\f(CWshll $0x4,%edx\fR\s0 shifts the contents of register \s10\f(CWedx\fR\s0 left by 4 bits, multiplying it by 16. If we compare this to the source, it's evident that the size of \s10\f(CWXrmOptionDescRec\fR\s0 is 16, and that the compiler has taken a short cut to evaluate the third parameter of the call. 
.Li \s10\f(CWpushl %edx\fR\s0 pushes the contents of \s10\f(CWedx\fR\s0 onto the stack. .Li \s10\f(CWpushl 0xfffffffc(%ebp)\fR\s0 pushes the value of the word at offset -4 in the current stack frame onto the stack. This is the value of \s10\f(CWtable\fR\s0, as we can confirm by looking at the instructions generated for the previous line. .Li \s10\f(CWpushl 0x8(%ebp)\fR\s0 pushes the value of the first parameter, \s10\f(CWsrc1\fR\s0, onto the stack. .Li Finally, \s10\f(CWcall _XtMemmove\fR\s0 calls the function. Expressed in C, we now know that it calls .Ps memmove (src1, table, num_src1 << 4); .Pe .Le This is, of course, wrong: the parameter sequence of source and destination has been reversed. Let's look at \s10\f(CW_XtMemmove\fR\s0 more carefully: .Ps (gdb) \f(CBl _XtMemmove\f(CW 89 #ifdef _XNEEDBCOPYFUNC 90 void _XtMemmove(dst, src, length) 91 char *dst, *src; 92 int length; 93 { 94 if (src < dst) { 95 dst += length; 96 src += length; 97 while (length--) 98 *--dst = *--src; 99 } else { 100 while (length--) 101 *dst++ = *src++; 102 } 103 } 104 #endif .Pe Clearly the function parameters are the same as those of \s10\f(CWmemmove\fR\s0, but the calling sequence has reversed them. We've found the problem, but we haven't found what's causing it. .LP \fIAside\fR: Debugging is not an exact science. We've found our problem, though we still don't know what's causing it. But looking back at .Ref e , p we see that the address for \s10\f(CWsrc\fR\s0 on entering \s10\f(CW_XtMemmove\fR\s0 was the same as the address of \s10\f(CWtable\fR\s0. That tells us as much as analyzing the machine code did. This will happen again and again: after you find a problem, you discover you did it the hard way. .LP The next thing we need to figure out is why the compiler reversed the sequence of the parameters. Can this be a compiler bug? Theoretically, yes, but it's very unlikely that such a primitive bug should go undiscovered up to now. 
.LP Remember that the compiler does not compile the sources you see: it compiles whatever the preprocessor hands to it. It makes a lot of sense to look at the preprocessor output. To do this, we go back to the library directory. Since we used \s10\f(CWpushd\fR\s0, this is easy--just enter \s10\f(CWpushd\fR\s0. In the library, we use the same trick as before in order to run the compiler with different options, only this time we use the options \s10\f(CW-E\fR\s0 (stop after running the preprocessor), \s10\f(CW-dD\fR\s0 (retain the text of the definitions in the preprocessor output), and \s10\f(CW-C\fR\s0 (retain comments in the preprocessor output). In addition, we output to a file \fIjunk.c\fR: .Ps $ \f(CBpushd\f(CW $ \f(CBrm Initialize.o\f(CW $ \f(CBmake Initialize.o\f(CW rm -f Initialize.o gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -g -I../.. \\ -D_SVID -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Initialize.c make: *** [Initialize.o] Interrupt \fI\&hit CTRL-C\f(CW \fI\&... copy the command into the command line, and extend:\f(CW $ \f(CWgcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -g -I../.. \\ -D_SVID -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Initialize.c \\ \f(CB-E -dD -C>junk.c $ .Pe As you might have guessed, we now look at the file \fIjunk.c\fR with an editor. We're looking for \s10\f(CWmemmove\fR\s0, of course. We find a definition in \fI/usr/include/string.h\fR, then later on we find, in \fI/X/X11/X11R6/xc/X11/Xfuncs.h\fR, .Ps #define memmove(dst,src,len) bcopy((char *)(src),(char *)(dst),(int)(len)) #define memmove(dst,src,len) _XBCOPYFUNC((char *)(src),(char *)(dst),(int)(len)) #define _XNEEDBCOPYFUNC .Pe For some reason, the configuration files have decided that \s10\f(CWmemmove\fR\s0 is not defined on this system, and have replaced it with \s10\f(CWbcopy\fR\s0 (which is really not defined on this system). Then they replace it with the substitute function \s10\f(CW_XBCOPYFUNC\fR\s0, almost certainly a preprocessor definition. 
It also defines the preprocessor variable \s10\f(CW_XNEEDBCOPYFUNC\fR\s0 to indicate that \s10\f(CW_XtMemmove\fR\s0 should be compiled. .LP Unfortunately, we don't see what happens with \s10\f(CW_XNEEDBCOPYFUNC\fR\s0. The preprocessor discards all \s10\f(CW#ifdef\fR\s0 lines. It does include \s10\f(CW#define\fR\s0s, however, so we can look for where \s10\f(CW_XBCOPYFUNC\fR\s0 is defined--it's in \fIIntrinsicI.h\fR, as the last \s10\f(CW#line\fR\s0 directive before the definition indicates. .Ps #define _XBCOPYFUNC _XtMemmove .Pe \fIIntrinsicI.h\fR also contains a number of definitions for \s10\f(CWXtMemmove\fR\s0, none of which are used in the current environment, but all of which have the parameter sequence \s10\f(CW(dst, src, count)\fR\s0. \s10\f(CWbcopy\fR\s0 has the parameter sequence \s10\f(CW(src, dst, count)\fR\s0. Clearly, somebody has confused something in this header file, and under certain rare circumstances the call is defined with the incorrect parameter sequence. .LP Somewhere in here is a lesson to be learnt: this is a real bug that occurred in X11R6, patch level 3, one of the most reliable and most portable software packages available, yet here we have a really primitive bug. The real problem lies in the configuration mechanism: automated configuration can save a lot of time in normal circumstances, but it can also cause lots of pain if it makes incorrect assumptions. In this case, the environment was unusual: the kernel platform was SCO UNIX, which has an old-fashioned library, but the library was GNU \fIlibc\fR. This caused the assumptions of the configuration mechanism to break down. 
.LP
Let's look more carefully at the part of \fIXfuncs.h\fR where we found the definitions:
.Ps
/* the new Xfuncs.h */
#if !defined(X_NOT_STDC_ENV) && (!defined(sun) || defined(SVR4))
/* the ANSI C way */
#ifndef _XFUNCS_H_INCLUDED_STRING_H
#include <string.h>
#endif
#undef bzero
#define bzero(b,len) memset(b,0,len)
#else /* else X_NOT_STDC_ENV or SunOS 4 */
#if defined(SYSV) || defined(luna) || defined(sun) || defined(__sxg__)
#include <memory.h>
#define memmove(dst,src,len) bcopy((char *)(src),(char *)(dst),(int)(len))
#if defined(SYSV) && defined(_XBCOPYFUNC)
#undef memmove
#define memmove(dst,src,len) _XBCOPYFUNC((char *)(src),(char *)(dst),(int)(len))
#define _XNEEDBCOPYFUNC
#endif
#else /* else vanilla BSD */
#define memmove(dst,src,len) bcopy((char *)(src),(char *)(dst),(int)(len))
#define memcpy(dst,src,len) bcopy((char *)(src),(char *)(dst),(int)(len))
#define memcmp(b1,b2,len) bcmp((char *)(b1),(char *)(b2),(int)(len))
#endif /* SYSV else */
#endif /* ! X_NOT_STDC_ENV else */
.Pe
This is hairy (and incorrect) stuff. It makes its decisions based on the variables \s10\f(CWX_NOT_STDC_ENV\fR\s0, \s10\f(CWsun\fR\s0, \s10\f(CWSVR4\fR\s0, \s10\f(CWSYSV\fR\s0, \s10\f(CWluna\fR\s0, \s10\f(CW__sxg__\fR\s0 and \s10\f(CW_XBCOPYFUNC\fR\s0. These are the decisions:
.Ls B
.Li
If the variable \s10\f(CWX_NOT_STDC_ENV\fR\s0 is \fInot\fR defined, it assumes ANSI C, unless this is a pre-SVR4 Sun machine.
.Li
Otherwise it checks the variables \s10\f(CWSYSV\fR\s0 (for System V.3), \s10\f(CWluna\fR\s0, \s10\f(CWsun\fR\s0 or \s10\f(CW__sxg__\fR\s0. If any of these are set, it includes the file \fImemory.h\fR and defines \fImemmove\fR in terms of \fIbcopy\fR. If \s10\f(CW_XBCOPYFUNC\fR\s0 is defined, it redefines \s10\f(CWmemmove\fR\s0 as \s10\f(CW_XBCOPYFUNC\fR\s0, reversing the parameters as it goes.
.Li If none of these conditions apply, it assumes a vanilla BSD machine and defines the functions \s10\f(CWmemmove\fR\s0, \s10\f(CWmemcpy\fR\s0 and \s10\f(CWmemcmp\fR\s0 in terms of the BSD functions \s10\f(CWbcopy\fR\s0 and \s10\f(CWbcmp\fR\s0. .Le There are two errors here: .Ls B .Li The only way that \s10\f(CW_XBCOPYFUNC\fR\s0 is ever defined is as \s10\f(CW_XtMemmove\fR\s0, which does \fInot\fR have the same parameter sequence as \s10\f(CWbcopy\fR\s0--instead, it has the same parameter sequence as \s10\f(CWmemmove\fR\s0. We can fix this part of the header by changing the definition line to .Ps #define memmove(dst,src,len) _XBCOPYFUNC((char *)(dst),(char *)(src),(int)(len)) .Pe .IP or even to .Ps #define memmove _XBCOPYFUNC .Pe .Li There is no reason to assume that this system does not use ANSI C: it's using \fIgcc\fR and GNU \fIlibc.a\fR, both of them very much standard compliant. We need to examine this point in more detail: .Le Going back to our \fIjunk.c\fR, we search for \s10\f(CWX_NOT_STDC_ENV\fR\s0 and find it defined at line 85 of \fI/X/X11/X11R6/xc/X11/Xosdefs.h\fR: .Ps #ifdef SYSV386 #ifdef SYSV #define X_NOT_POSIX #define X_NOT_STDC_ENV #endif #endif .Pe In other words, this bug is likely to occur only with System V.3 implementations on Intel architecture. This is a fairly typical way to make decisions about the system, but it is wrong: \s10\f(CWX_NOT_STDC_ENV\fR\s0 relates to a compiler, not an operating system, but both \s10\f(CWSYSV386\fR\s0 and \s10\f(CWSYSV\fR\s0 define operating system characteristics. At first sight it would seem logical to modify the definitions like this: .Ps #ifdef SYSV386 #ifdef SYSV #ifndef __GNU_LIBRARY__ #define X_NOT_POSIX #endif #ifndef __GNUC__ #define X_NOT_STDC_ENV #endif #endif #endif .Pe This would only define the variables if the library is not GNU \fIlibc\fR or the compiler is not \fIgcc\fR. 
This is still not correct: the relationship between \s10\f(CW__GNUC__\fR\s0 and \s10\f(CWX_NOT_STDC_ENV\fR\s0 or \s10\f(CW__GNU_LIBRARY__\fR\s0 and \s10\f(CWX_NOT_POSIX\fR\s0 has nothing to do with System V or the Intel architecture. Instead, it makes more sense to backtrack at the end of the file:
.Ps
#ifdef __GNU_LIBRARY__
#undef X_NOT_POSIX
#endif
#ifdef __GNUC__
#undef X_NOT_STDC_ENV
#endif
.Pe
Whichever way we look at it, this is a mess. We're applying cosmetic patches to a configuration mechanism which is based on incorrect assumptions. Until some better configuration mechanism comes along, unfortunately, we're stuck with this situation.
.Bh "Limitations of debuggers"
Debuggers are useful tools, but they have their limitations. Here are a few which could cause you problems:
.Ch "Can't breakpoint beyond fork"
.XX "debugging, past fork"
.XX "fork, debugging past"
.Pn fork-debugging
UNIX packages frequently start multiple processes to do the work on hand. Frequently enough, the program that you start does nothing more than to spawn a number of other processes and wait for them to stop. Unfortunately, the \s10\f(CWptrace\fR\s0 interface which debuggers use requires the process to be started by the debugger. Even in SunOS 4, where you can attach the debugger to a process that is already running, there is no way to monitor it from the start. Other systems don't offer even this facility. In some cases you can determine how the process was started and start it with the debugger in the same manner. This is not always possible--for example, many child processes communicate with their parent.
.Ch "Terminal logs out"
The debugger usually shares a terminal with the program being tested. If the program changes the driver configuration, the debugger should change it back again when it gains control (for example, on hitting a breakpoint), and set it back to the way the program set it before continuing. In some cases, however, it can't: if the process has taken ownership of the terminal with a system call like \fIsetsid\fR (see \*[chkdepend], page \*[setsid]), the debugger will no longer have access to the terminal. Under these circumstances, most debuggers crawl into a corner and die. Then the shell in control of the terminal awakes and dies too. If you're running in an \fIxterm\fR, the \fIxterm\fR then stops; if you're running on a glass tty, you will be logged out.
.LP
The best way out of this dilemma is to start the child process on a different terminal, if your debugger and your hardware configuration support it. To do this with an \fIxterm\fR requires starting a program which just sleeps, so that the window stays open until you can start your test program:
.Ps
$ \f(CBxterm -e sleep 100000&\f(CW
[1] 27013
$ \f(CBps aux|grep sleep\f(CW
grog 27025 3.0 0.0 264 132 p6 S+ 1:13PM 0:00.03 grep sleep
root 27013 0.0 0.0 1144 740 p6 I 1:12PM 0:00.37 xterm -e sleep 100000
grog 27014 0.0 0.0 100 36 p8 Is+ 1:12PM 0:00.06 sleep 100000
$ \f(CBgdb myprog\f(CW
(gdb) \f(CBr < /dev/ttyp8 > /dev/ttyp8\f(CW
.Pe
This example was done on a BSD machine. On a System V machine you will need to use \fIps -ef\fR instead of \fIps aux\fR. First, you start an \fIxterm\fR with \fIsleep\fR as controlling shell (so that it will stay there).
With \fIps\fR you grep for the controlling terminal of the \fIsleep\fR process (the third line in the example), and then you start your program with \fIstdin\fR and \fIstdout\fR redirected to this terminal. .Ch "Can't interrupt process" The \fIptrace\fR interface uses the signal \s10\f(CWSIGTRAP\fR\s0 to communicate with the process being debugged. What happens if you block this signal, or ignore it? Nothing--the debugger doesn't work any more. It's bad practice to block \s10\f(CWSIGTRAP\fR\s0, of course, but it can be done. More frequently, though, you'll encounter this problem when a process gets stuck in a signal processing loop and doesn't get round to processing the \s10\f(CWSIGTRAP\fR\s0--precisely one of the times when you would want to interrupt it. My favourite one is the program which had a \s10\f(CWSIGSEGV\fR\s0 handler which went and retried the instruction. Unfortunately, the only signal to which a process in this state will still respond is \s10\f(CWSIGKILL\fR\s0, which doesn't help you much in finding out what's going on. .Ah "Tracing system calls" .XX "tracing, system calls" .XX "system calls, tracing" An alternative approach is to divide the program between system code and user code. Most systems have the ability to trace the parameters supplied to each system call and the results that they return. This is not nearly as good as using a debugger, but it works with all object files, even if they don't have symbols, and it can be very useful when you're trying to figure out why a program doesn't open a specific file. .LP Tracing is a very system-dependent function, and there are a number of different programs to perform the trace: \fItruss\fR runs on System V.4, \fIktrace\fR runs on BSD NET/2 and 4.4BSD derived systems, and \fItrace\fR runs on SunOS 4. They vary significantly in their features. We'll look briefly at each. 
Other systems supply still other programs--for example, SGI's IRIX operating system supplies the program \fIpar\fR, which offers similar functionality. .XX "par, program" .XX "program, par" .Bh "trace" .XX "trace" \fItrace\fR is a relatively primitive tool supplied with SunOS 4 systems. It can either start a process or attach to an existing process, and it can print summary information or a detailed trace. In particular, it \fIcannot\fR trace the child of a \s10\f(CWfork\fR\s0 call, which is a great disadvantage. Here's an example of \fItrace\fR output with a possibly recognizable program: .Ps $ \f(CBtrace hello\f(CW open ("/usr/lib/ld.so", 0, 040250) = 3 read (3, "".., 32) = 32 mmap (0, 40960, 0x5, 0x80000002, 3, 0) = 0xf77e0000 mmap (0xf77e8000, 8192, 0x7, 0x80000012, 3, 32768) = 0xf77e8000 open ("/dev/zero", 0, 07) = 4 getrlimit (3, 0xf7fff488) = 0 mmap (0xf7800000, 8192, 0x3, 0x80000012, 4, 0) = 0xf7800000 close (3) = 0 getuid () = 1004 getgid () = 1000 open ("/etc/ld.so.cache", 0, 05000100021) = 3 fstat (3, 0xf7fff328) = 0 mmap (0, 4096, 0x1, 0x80000001, 3, 0) = 0xf77c0000 close (3) = 0 open ("/opt/lib/gcc-lib/sparc-sun-sunos".., 0, 01010525) = 3 fstat (3, 0xf7fff328) = 0 getdents (3, 0xf7800108, 4096) = 212 getdents (3, 0xf7800108, 4096) = 0 close (3) = 0 open ("/opt/lib", 0, 056) = 3 getdents (3, 0xf7800108, 4096) = 264 getdents (3, 0xf7800108, 4096) = 0 close (3) = 0 open ("/usr/lib/libc.so.1.9", 0, 023170) = 3 read (3, "".., 32) = 32 mmap (0, 458764, 0x5, 0x80000002, 3, 0) = 0xf7730000 mmap (0xf779c000, 16384, 0x7, 0x80000012, 3, 442368) = 0xf779c000 close (3) = 0 open ("/usr/lib/libdl.so.1.0", 0, 023210) = 3 read (3, "".., 32) = 32 mmap (0, 16396, 0x5, 0x80000002, 3, 0) = 0xf7710000 mmap (0xf7712000, 8192, 0x7, 0x80000012, 3, 8192) = 0xf7712000 close (3) = 0 close (4) = 0 getpagesize () = 4096 brk (0x60d8) = 0 brk (0x70d8) = 0 ioctl (1, 0x40125401, 0xf7ffea8c) = 0 write (1, "Hello, World!\n", 14) = Hello, World! 
14
close (0) = 0
close (1) = 0
close (2) = 0
exit (1) = ?
.Pe
What's all this output? All we did was a simple write, but we have performed a total of 43 system calls. This shows in some detail how much the viewpoint of the world differs when you're on the other side of the system library. This program, which was run on a SparcStation 2 with SunOS 4.1.3, first sets up the shared libraries (the sequences of \s10\f(CWopen\fR\s0, \s10\f(CWread\fR\s0, \s10\f(CWmmap\fR\s0, and \s10\f(CWclose\fR\s0), then initializes the \s10\f(CWstdio\fR\s0 library (the calls to \s10\f(CWgetpagesize\fR\s0, \s10\f(CWbrk\fR\s0, \s10\f(CWioctl\fR\s0, and \s10\f(CWfstat\fR\s0), and finally writes to \fIstdout\fR and exits. It also looks strange that it closed \fIstdin\fR before writing the output text: again, this is a matter of perspective. The \s10\f(CWstdio\fR\s0 routines buffer the text, and it didn't actually get written until the process exited, just before closing \fIstdout\fR.
.Bh "ktrace"
.XX "ktrace"
.XX "ktrace.out, file"
.XX "file, ktrace.out"
.XX "kdump"
\fIktrace\fR is supplied with newer BSD systems. Unlike the other trace programs, it writes unformatted data to a log file (by default, \fIktrace.out\fR), and you need to run another program, \fIkdump\fR, to display the log file. It has the following features:
.Ls B
.Li
It can trace the descendents of the process it is tracing. This is particularly useful when the bug occurs in large complexes of processes, and you don't even know which process is causing the problem.
.Li
It can attach to processes that are already running. Optionally, it can also attach to existing children of the processes to which it attaches.
.Li
It can specify broad subsets of system calls to trace: system calls, namei translations (translation of file name to inode number), I/O, and signal processing.
.Le
Here's an example of \fIktrace\fR running against the same program:
.Ps
$ \f(CBktrace hello\f(CW
Hello, World!
$ \f(CBkdump\f(CW 20748 ktrace RET ktrace 0 20748 ktrace CALL getpagesize 20748 ktrace RET getpagesize 4096/0x1000 20748 ktrace CALL break(0xadfc) 20748 ktrace RET break 0 20748 ktrace CALL break(0xaffc) 20748 ktrace RET break 0 20748 ktrace CALL break(0xbffc) 20748 ktrace RET break 0 20748 ktrace CALL execve(0xefbfd148,0xefbfd5a8,0xefbfd5b0) 20748 ktrace NAMI "./hello" 20748 hello RET execve 0 20748 hello CALL fstat(0x1,0xefbfd2a4) 20748 hello RET fstat 0 20748 hello CALL getpagesize 20748 hello RET getpagesize 4096/0x1000 20748 hello CALL break(0x7de4) 20748 hello RET break 0 20748 hello CALL break(0x7ffc) 20748 hello RET break 0 20748 hello CALL break(0xaffc) 20748 hello RET break 0 20748 hello CALL ioctl(0x1,TIOCGETA,0xefbfd2e0) 20748 hello RET ioctl 0 20748 hello CALL write(0x1,0x8000,0xe) 20748 hello GIO fd 1 wrote 14 bytes "Hello, World! " 20748 hello RET write 14/0xe 20748 hello CALL exit(0xe) .Pe This display contains the following information in columnar format: .Ls N .Li The process ID of the process. .Li The name of the program from which the process was started. We can see that the name changes after the call to \s10\f(CWexecve\fR\s0. .Li The kind of event. \s10\f(CWCALL\fR\s0 is a system call, \s10\f(CWRET\fR\s0 is a return value from a system call, \s10\f(CWNAMI\fR\s0 is a system internal call to the function \s10\f(CWnamei\fR\s0, which determines the inode number for a pathname, and \s10\f(CWGIO\fR\s0 is a system internal I/O call. .Li The parameters to the call. .Le In this trace, run on an Intel 486 with BSD/OS 1.1, we can see a significant difference from SunOS: there are no shared libraries. Even though each system call produces two lines of output (the call and the return value), the output is much shorter. .Bh "truss" .XX "truss" \fItruss\fR, the System V.4 trace facility, offers the most features: .Ls B .Li It can print statistical information instead of a trace. 
.Li It can display the argument and environment strings passed to each call to \s10\f(CWexec\fR\s0. .Li It can trace the descendents of the process it is tracing. .Li Like \fIktrace\fR, it can attach to processes which are already running and optionally attach to existing children of the processes to which it attaches. .Li It can trace specific system calls, signals, and interrupts (called \fIfaults\fR in System V terminology). This is a very useful feature: as we saw in the \fIktrace\fR example above, the C library may issue a surprising number of system calls. .Le Here's an example of \fItruss\fR output: .Ps $ \f(CBtruss -f hello\f(CW 511: execve("./hello", 0x08047834, 0x0804783C) argc = 1 511: getuid() = 1004 [ 1004 ] 511: getuid() = 1004 [ 1004 ] 511: getgid() = 1000 [ 1000 ] 511: getgid() = 1000 [ 1000 ] 511: sysi86(SI86FPHW, 0x80036058, 0x80035424, 0x8000E255) = 0x00000000 511: ioctl(1, TCGETA, 0x08046262) = 0 Hello, World! 511: write(1, " H e l l o , W o r l d".., 14) = 14 511: _exit(14) .Pe \fItruss\fR offers a lot of choice in the amount of detail it can display. For example, you can select a verbose parameter display of individual system calls. If we're interested in the parameters to the \s10\f(CWioctl\fR\s0 call, we can enter: .Ps $ \f(CBtruss -f -v ioctl hello\f(CW \fI\&...\f(CW 516: ioctl(1, TCGETA, 0x08046262) = 0 516: iflag=0004402 oflag=0000005 cflag=0002675 lflag=0000073 line=0 516: cc: 177 003 010 030 004 000 000 000 .Pe .XX "termio" In this case, \fItruss\fR shows the contents of the \fItermio\fR structure associated with the \s10\f(CWTCGETA\fR\s0 request--see \*[chterm], pages \*[termios] and \*[TCGETA], for the interpretation of this information. .Bh "Tracing through fork" .XX "tracing, through fork" .XX "fork, tracing through" We've seen that \fIktrace\fR and \fItruss\fR can both trace the child of a \s10\f(CWfork\fR\s0 system call. This is invaluable: as we saw on page \*[fork-debugging], debuggers can't do this. 
.LP Unfortunately, SunOS trace doesn't support tracing through \s10\f(CWfork\fR\s0. \fItruss\fR does it better than \fIktrace\fR. In extreme cases (like debugging a program of this nature on SunOS 4, where there is no support for trace through \s10\f(CWfork\fR\s0), you might find it an advantage to port to a different machine running an operating system such as Solaris 2 in order to be able to test with \fItruss\fR. Of course, Murphy's law says that the bug won't show up under Solaris 2. .Bh "Tracing network traffic" .XX "tracing, network traffic" .XX "network traffic, tracing" Another place where we can trace is at the network interface. Many processes communicate across the network, and if we have tools to look at this communication, they may help us isolate the part of the package that is causing the problem. .LP Two programs trace message flow across a network: .Ls B .Li .XX "tcpdump" .XX "Berkeley Packet Filter" On BSD systems, \fItcpdump\fR and the \fIBerkeley Packet Filter\fR provide a flexible means of tracing traffic across Internet domain sockets. See \*[appsource], for availability. .Li .XX "trpt" \fItrpt\fR will print a trace from a socket marked for debugging. This function is available on System V.4 as well, though it is not clear what use it is under these circumstances, since System V.4 emulates sockets in a library module. On BSD systems, it comes in a poor second to \fItcpdump\fR. .Le Tracing net traffic is an unusual approach, and we won't consider it here, but in certain circumstances it is an invaluable tool. You can find all you need to know about \fItcpdump\fR in \fITCP/IP Illustrated, Volume 1\fR, by Richard Stevens. .XX "Stevens, W. 
Richard" @ 3.0 log @Final draft @ text @d2 1 a2 1 .\" $Id: testing.ms,v 2.8 1995/06/25 10:55:49 grog Exp grog $ d4 3 d564 1 a564 1 the the \f(CWlist\fR command: @ 2.8 log @Final draft, second cut @ text @d2 1 a2 1 .\" $Id: testing.ms,v 2.7 1995/06/24 11:45:14 grog Exp grog $ d4 3 @ 2.7 log @Final draft, first cut. @ text @d2 1 a2 1 .\" $Id: testing.ms,v 2.6 1995/06/09 04:31:13 grog Exp grog $ d4 3 d39 5 a43 5 Well, this time you're not quite done after all. Occasionally the program does not work as advertised. What you do now depends on how much programming experience you have. If you are a complete beginner, you could be in trouble--about the only thing you can do (apart from asking somebody else) is to go back and check that you really did configure the package correctly. d76 6 a81 4 a segmentation violation or a bus error.\** .FS See \*[chsignal], page \*[SIGBUSEGV], for more details. .FE d262 7 a268 6 dividing the program into a number of parts, the function \s10\f(CWmain\fR\s0 and the set of functions which \s10\f(CWmain\fR\s0 calls. By single stepping over the function calls until something blows up, we can establish in which part the problem occurs. Then we can restart the program and single step through this function until we find what it calls before dying. This iterative approach sounds slow and tiring, but in fact it works surprisingly well. d379 2 a380 1 .\" and it doesn't look too good anyway. Any ideas? d387 2 a388 2 ^C make: *** [Alloc.o] Interrupt \fIinterrupt make with CTRL-C\f(CW d908 1 a908 1 Other systems don't even offer this facility. In some cases you can determine d914 1 a914 1 \s10\f(CWfork\fR\s0. \fItruss\fR does it better than \fIktrace\fR. In extreme d1183 1 a1183 1 \s10\f(CWfork\fR\s0. \fItruss\fR does it better than \fIktrace\fR. In extreme @ 2.6 log @Remove date from page headers Minor mods @ text @d2 1 a2 1 .\" $Id: testing.ms,v 2.5 1995/05/17 17:33:19 grog Exp grog $ d4 4 d36 5 a40 5 Well, maybe you're not quite done after all.
The program does not work as advertised. What you do now depends on how much programming experience you have. If you are a complete beginner, you could be in trouble--about the only thing you can do (apart from asking somebody else) is to go back and check that you really did configure the package correctly. d256 7 a262 7 first approach won't work here. In this case, you can start by conceptually .\" XXX conceptually? dividing the program into the set of functions called by \s10\f(CWmain\fR\s0. We can single step over the function calls until something blows up: then we can restart the program and single step through this function until we find what it calls before dying. This iterative approach sounds slow and tiring, but in fact it works surprisingly well. d891 22 a959 2 .\" XXX James Cox refers to SGI's 'par' (if I read correctly). Are .\" we interested? d971 5 a975 1 vary significantly in their features. We'll look briefly at each. d1172 2 a1173 10 \s10\f(CWfork\fR\s0 system call. This is invaluable: UNIX packages frequently start multiple processes to do the work on hand. Frequently enough, the program that you start does nothing more than spawn a number of other processes and wait for them to stop. One of the significant disadvantages of the \s10\f(CWptrace\fR\s0 interface to debugging is that the process needs to be started by the debugger. Even in SunOS 4, where you can attach the debugger to a process that is already running, there is no way to monitor it from the start. In some cases you can determine how the process was started and start it with the debugger in the same manner. This is not always possible--for example, many child processes communicate with their parent. @ 2.5 log @Major mods after Andy's final draft review @ text @d2 1 a2 1 .\" $Id: testing.ms,v 2.4 1995/02/22 14:24:21 grog Exp grog $ d4 3 d22 1 a22 1 .St "Testing ($Date: 1995/02/22 14:24:21 $)" d33 15 a47 7 advertised.
There are thousands of possible reasons for the problems you encounter when you try to run a buggy executable, and lots of good books explain debugging techniques. In this chapter, we will touch only on aspects of debugging that relate to porting. First we'll attack a typical, if somewhat involved, real-life bug and solve it, discussing the pros and cons on the way. Then we'll look at alternatives to traditional debuggers: kernel and network tracing. d162 1 a162 1 A \fIstack trace\fR command answers the question "Where am I, and how did I get d226 10 a235 9 effectively single steps until you leave the current symbolic line. To add to the confusion, this is also frequently called \fIsingle stepping\fR. This command comes in two flavours, depending on how it treats function calls. One form will execute the function and stop the program at the next line after the call. The other, more thorough form will stop execution at the first executable line of the function. It's important to notice the difference between these two functions: both are extremely useful, but for different things. \fIgdb\fR performs single line execution omitting calls with the \s10\f(CWnext\fR\s0 command, and includes calls with the \s10\f(CWstep\fR\s0 command. d242 2 a243 2 Sometimes this method doesn't work well: the process may end up in no-mans-land, and you see something like: d252 7 a258 6 first approach won't work here. In this case, you can start by dividing the program into the set of functions called by \s10\f(CWmain\fR\s0. We can single step over the function calls until something blows up: then we can restart the program and single step through this function until we find what it calls before dying. This iterative approach sounds slow and tiring, but in fact it works surprisingly well. 
d267 1 d350 1 a350 1 In order to compile debugging information, we add the compiler flag d353 1 a353 1 We have three options about how to set the flags: d363 7 a369 3 start the compilation with \fImake\fR, but before the compilation completes, we abort it with \s10\f(CWCTRL-C\fR\s0. Using the \fIxterm\fR copy function, we copy the compiler invocation to the command line and add the flags we want: d371 1 a371 1 $ \f(CBrm Alloc.o\f(CW \fI\&remove the old object\f(CW d374 3 a376 2 gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -I../.. \e -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Alloc.c d378 2 d382 31 d428 3 a430 21 .Li We could change \s10\f(CWCFLAGS\fR\s0 from the \fImake\fR command line. Our first attempt doesn't work too well, though: .Ps $ \f(CBmake CFLAGS=-g\f(CW rm -f Alloc.o gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -g Alloc.c .Pe .Le \s10\f(CWCFLAGS\fR\s0 included all the compiler flags except \s10\f(CW-c\fR\s0, so we need to write: .Ps $ \f(CBmake CFLAGS='-g -c -I../.. -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL'\f(CW .Pe .LP .XX "Alloc.o" .XX "Initialize.o" .XX "Display.o" We now have a copy of the X Toolkit in which the three files \fIAlloc.o\fR, \fIInitialize.o\fR, and \fIDisplay.o\fR have been compiled with symbols. Next, we need to rebuild \fIxterm\fR. That's straightforward enough: d525 1 a525 1 (gdb) \f(CBf 1\f(CW d532 9 a540 5 \f(CWXtScreenDatabase\fR, not from \f(CW_MergeOptionTables\fR. Why? At the moment it's difficult to say for sure, but it's possible that this difference happened because we removed optimization. In any case, we still have a problem, so we should fix this one first and then go back and look for the other one if solving this problem isn't enough. 
d609 2 a610 2 It's definitely quicker to look at the instructions than to fight your way through the thick undergrowth in the X11 source tree.: d666 1 a666 1 (gdb) \f(CBl 90\f(CW d705 1 a705 1 different flags, only this time we use the flags \s10\f(CW-E\fR\s0 (stop after d746 1 a746 1 \s10\f(CW#line\fR\s0 directive before the line indicates. d1149 4 a1152 4 started by the debugger. Even in SunOS 4, where you can attach to a process that is already running, there is no way to monitor it from the start. In some cases you can determine how the process was started and start it with the debugger in the same manner. This is not always possible--for example, many @ 2.4 log @Minor mods @ text @d2 1 a2 1 .\" $Id: testing.ms,v 2.3 1995/02/03 13:25:50 grog Exp grog $ d4 3 d19 1 a19 1 .St "Testing ($Date: 1995/02/03 13:25:50 $)" d32 5 a36 5 debugging techniques. In this chapter, we will only touch on aspects of debugging which relate to porting. First we'll look at debuggers, using a typical, if somewhat involved, real-life bug and solve it, discussing the pros and cons on the way. Then we'll look at alternatives to traditional debuggers: kernel and network tracing. d40 5 a44 5 test suites are available for others. For other packages again there may be test suites which were not designed for the package, but which can be used with it. If there are any tests, you should obviously run them. You might also consider writing some tests and including them as a target \s10\f(CWtest\fR\s0 in the \fIMakefile\fR. d48 2 a49 2 development. A program under development still has bugs which prevent it from funning correctly on any platform, while a ported program has already run d58 1 a58 1 a segmentation violation or a memory error.\** d69 1 a69 1 As a result, the program contains bugs which were not in the original versions.
d93 1 a93 1 X11R6) to look at various techniques which you can use to localize and finally d115 1 a115 1 program into pieces which communicate across this line, so you can see what they d118 1 a118 1 Of course, I have all these things. In the following sections we'll look at d124 1 d129 4 a132 4 without wheels--that's a comparable level of technology. The GNU debugger, \fIgdb\fR, is available on just about every platform you're likely to encounter, and though it's not perfect, it runs rings around techniques like putting \fIprintf\fR statements in your programs. d136 2 a137 2 In UNIX, a debugger is a process which takes control of the execution of another process. Most versions of UNIX only allow one way for the debugger to take d169 24 d198 1 a198 1 program counter (the register which contains the address of the next d209 1 a209 1 generates a hardware interrupt which ultimately causes a \s10\f(CWSIGTRAP\fR\s0 d218 6 a223 31 form will execute the function and stop the program at the instruction after the call instruction. The other, more thorough form will stop execution at the first executable line of the function. It's important to notice the difference between these two functions: both are extremely useful, but for different things. \fIgdb\fR performs single line execution omitting calls with the \s10\f(CWnext\fR\s0 command, and includes calls with the \s10\f(CWstep\fR\s0 command. .Li .XX "breakpoint, debugger" .XX "debugger, breakpoint" \fIbreakpoints\fR stop execution of the process when the process attempts to execute an instruction at a certain address. \fIgdb\fR sets breakpoints with the \s10\f(CWbreak\fR\s0 command. .Li .XX "watchpoint, debugger" .XX "debugger, watchpoint" Many modern machines have hardware support for more sophisticated breakpoint mechanisms. For example, the i386 architecture can support four hardware breakpoints on instruction fetch (in other words, traditional breakpoints), memory read or memory write. 
These features are invaluable in systems which support them; unfortunately, UNIX usually does not. \fIgdb\fR simulates this kind of breakpoint with a so-called \fIwatchpoint\fR. When watchpoints are set, \fIgdb\fR simulates program execution by single-stepping through the program. When the condition (for example, writing to the global variable \s10\f(CWfoo\fR\s0) is fulfilled, the debugger stops the program. This slows down the execution speed by several orders of magnitude, whereas a real hardware breakpoint has no impact on the execution speed.\** .FS Some architectures slow the overall execution speed slightly in order to test the hardware registers. The effect is negligible. .FE d250 4 a253 4 Let's come back to our \fIxterm\fR program. This time we'll use \fIgdb\fR to figure out what is going on. We could, of course, look at the core dump, but in this case we can repeat the problem at will, so we're better off looking at the live program. We enter: d277 2 a278 2 case, where we have just built the complete X11 core system, there's every possibility thhat it is a library bug. As usual, the library was compiled d302 4 a305 4 but to do that we need to know the fact that the \fIXt\fR functions are in fact the X toolkit. If we're using GNU \fImake\fR, or if our \fIMakefile\fR documents directory changes, an alternative would be to go back to our \fImake\fR log and look for the text \fIXt\fR. If we do this, we quickly find d317 1 a317 1 which contains \s10\f(CWXtMemmove\fR\s0. There is a possibility that it is d319 4 a322 1 grep for it: d324 1 a324 1 $ grep XtMemmove *.c d431 2 a432 2 cause a bus error on that statement would be an invalid destination address, but the parameters show that \s10\f(CWdst\fR\s0 appears to be valid. d444 1 a444 1 a segementation violation or a bus error. d474 5 a478 7 tell the truth. 
Debuggers must be able to write to the text segment (to set breakpoints, for example), and they may either consider it a feature to be able to change the text segment, or they may not even notice. If the write had failed, you could have been sure that the address was not writable, but if the write succeeds, you can't be sure. What we need to know are the exact segment limits. Some debuggers show you the segment limits, but current versions of \fIgdb\fR do not. An alternative is the \fIsize\fR command: d502 1 a502 1 we solving this problem doesn't explain the other problem. d504 6 a509 5 This alone doesn't tell us very much, except that the destination variable is called \s10\f(CWtable\fR\s0, and implicitly that \s10\f(CWmemmove\fR\s0 has been defined as \f(CW_XtMemmove\fR in this source file. We could now look at the source file in an editor in a different X window, but it's easier to list the instructions around the current line with the the \f(CWlist\fR command: d584 4 a587 4 This isn't easy stuff to handle, but it's so typical of what you might find that we'll pull it apart, instruction for instruction. It's easier to understand this discussion if you refer to the diagrams of stack structure in \*[chobj], page \*[complete-stack]. d592 2 a593 2 have seen, this is the second parameter passed to \s10\f(CW_MergeOptionTables\fR\s0, \s10\f(CWnum_src1\fR\s0. a649 1 .QS a656 1 .QQE d660 1 a660 1 very unlikely that such a primitive bug should not have been discovered earlier. d700 1 a700 1 certainly a define. It also defines the preprocessor variable d712 7 a718 4 \fIIntrinsicI.h\fR also contains a number of definitions for XtMemmove, none of which are used in the current environment, but all of which have the parameter sequence \s10\f(CW(dst, src, count)\fR\s0. \s10\f(CWbcopy\fR\s0 has the parameter sequence \s10\f(CW(src, dst, count)\fR\s0. 
d720 1 a720 1 Somewhere in here is a lesson to be learnt: this is a real bug which occurred in d761 1 a761 2 \s10\f(CW__sxg__\fR\s0 and \s10\f(CW_XBCOPYFUNC\fR\s0. It makes the following decisions: d782 3 a784 3 \s10\f(CW_XBCOPYFUNC\fR\s0 is only ever defined as \s10\f(CW_XtMemmove\fR\s0, which does \fInot\fR have the same parameter sequence as \s10\f(CWbcopy\fR\s0--instead, it has the same parameter sequence as d790 1 a790 1 .LP d810 1 a810 1 In other words, this bug is only likely to occur with System V.3 implementations d846 48 d896 2 d901 9 a909 5 system call and the results that they return. This is a very system-dependent function, and there are a number of different programs to perform the trace: \fItruss\fR runs on System V.4, \fIktrace\fR runs on BSD NET/2 and 4.4BSD derived systems, and \fItrace\fR runs on SunOS 4. They vary significantly in their features. We'll look briefly at each. d912 2 a913 2 \fItrace\fR is a relatively primitive tool supplied with SunOS 4 systems: it can either start a process or attach to an existing process, and it can print d991 1 a991 1 It can attach to processes which are already running. Optionally, it can also d995 1 a995 1 translations (translation of file name to inode number), I/O and signal d1050 1 a1050 1 difference from SunOS: there are no shared libraries--even though each system d1063 1 a1063 3 It can trace the descendents of the process it is tracing. This is particularly useful when the bug occurs in large complexes of processes, and you don't even know which process is causing the problem. d1065 2 a1066 2 It can attach to processes which are already running. Optionally, it can also attach to existing children of the processes to which it attaches. d1070 2 a1071 1 \fIktrace\fR example above, the C library may call a surprising number d1089 2 a1090 2 For example, if we're interested in the parameters to the \s10\f(CWioctl\fR\s0 call, we can enter: d1101 1 a1101 1 \*[termios] and \*[TCGETA], for further information. 
d1106 5 a1110 5 \s10\f(CWfork\fR\s0 system call. This is invaluable: UNIX program packages frequently start multiple processes to do the work on hand. Frequently enough, the program that you start does nothing more than spawn a number of other processes and wait for them to stop. One of the significant disadvantages of the \s10\f(CWptrace\fR\s0 interface to debugging is that the process needs to be d1117 7 a1123 7 Unfortunately, SunOS trace doesn't support tracing through \s10\f(CWfork\fR\s0, and \fItruss\fR does it better than \fIktrace\fR. In extreme cases (like debugging a program of this nature on SunOS 4, where there is no support for trace through \s10\f(CWfork\fR\s0), you might find it an advantage to port to a different machine running an operating system such as Solaris 2 in order to be able to test with \fItruss\fR. Of course, Murphy's law says that the bug won't show up under Solaris 2. d1127 1 a1127 1 Another interface which we can trace is the network interface. Many processes d1129 1 a1129 1 communication, they may help us isolate the part of the package which is causing d1132 1 a1132 1 Two programs support tracing message flow across a network: d1145 1 a1145 1 BSD systems, it comes a poor second to \fItcpdump\fR. d1149 2 a1150 1 know about \fItcpdump\fR in [Stevens 94]. @ 2.3 log @Mods after Andy's review @ text @d2 1 a2 1 .\" $Id: testing.ms,v 2.2 1995年01月25日 14:34:05 grog Exp grog $ d4 3 d16 1 a16 1 .St "Testing ($Date: 1995年01月25日 14:34:05 $)" d311 2 a312 1 .XX "XtMemmove.c" d444 1 a444 1 (gdb) \f(CBx/6i $eip\f(CW \fI\&show the \f(CW @ 2.2 log @Minor mods @ text @d2 1 a2 1 .\" $Id: testing.ms,v 2.1 1995年01月25日 13:34:27 grog Exp grog $ d4 3 d13 1 a13 1 .St "Testing ($Date: 1995年01月25日 13:34:27 $)" d17 1 a17 1 After a brief moment of euphoria, you sit down at the terminal and start the d27 4 a30 3 debugging which relate to porting. 
First we'll look at debugging tools, then we'll take a typical, if somewhat involved, real-life bug and solve it, discussing the pros and cons on the way. d39 26 d66 2 d76 1 d90 7 a96 6 which we can conveniently observe, and check which of them is misbehaving. When we find the piece which is misbehaving, we keep subdividing it further until we find the bug. The emphasis in this method is on \fIconvenient\fR: it doesn't necessarily have to make sense. As long as you can continue to divide your problem area into between two and five parts and localize the problem in one of the parts, it won't take long to find the bug. d98 2 a99 2 So what's a convenient way to look at the problems? That depends entirely on the tools you have at your disposal: d112 2 a113 1 Of course, we have all these things. We'll look at each of them in more detail. d115 3 d125 1 a125 2 \fIprintf\fR statements in your programs. You can find a copy on the companion CD-ROM in the directory \fIgnu/gdb-4.13\fR. d127 2 a128 2 .XX "PTRACE_ATTACH" .XX "debugging, attach process" d130 4 a133 6 process. It uses the system call \s10\f(CWptrace\fR\s0 to start and stop execution and to read and write data from the process. In most cases, the process must agree to this treatment, which means that the debugger must start it. SunOS 4 systems also the \s10\f(CWPTRACE_ATTACH\fR\s0 subcommand of \s10\f(CWptrace\fR\s0 to take control of a process which has not agreed to such treatment. This feature is no longer available with SunOS 5 (Solaris 2). d141 9 d174 2 d189 4 a192 5 first address in the function after the stack trace linkage has been established (see \*[chobj], page \*[fn-entry-breakpoint]). It's important to notice the difference between these two functions: both are extremely useful, but for different things. 
\fIgdb\fR performs single line execution omitting calls with the \s10\f(CWnext\fR\s0 command, and includes calls with the \s10\f(CWstep\fR\s0 d195 5 a199 7 \fIbreakpoints\fR, as specified by \s10\f(CWptrace\fR\s0, tell the kernel to stop execution of the process and send the debugger a \s10\f(CWSIGTRAP\fR\s0 signal when the process attempts to execute an instruction at a certain address. In traditional architectures, this is achieved by changing the instruction at the specified location to a system call or an illegal instruction--something which will generate a processor interrupt. \fIgdb\fR sets breakpoints with the \s10\f(CWbreak\fR\s0 command. d201 2 d204 10 a213 8 mechanisms than \s10\f(CWptrace\fR\s0 offers. For example, the i386 architecture can support four hardware breakpoints on instruction fetch (in other words, traditional breakpoints), memory read or memory write. These features are invaluable in systems which support them; unfortunately, UNIX usually does not. \fIgdb\fR simulates this kind of breakpoint with a so-called \fIwatchpoint\fR, which involves single-stepping through the program. This slows down the execution speed by several orders of magnitude, whereas a real hardware breakpoint has no impact on the execution speed.\** a217 8 .Li .XX "Where am I?" .XX "How did I get here?" A \fIstack trace\fR command answers the question "Where am I, and how did I get here?", and is almost the most useful of all commands. It's certainly the first thing you should do when examining a core dump or after getting a signal while debugging the program. \fIgdb\fR implements this function with the \s10\f(CWbacktrace\fR\s0 command. d219 2 a220 8 A couple of properties of debuggers help determine how we subdivide our program when testing it: the \s10\f(CWbacktrace\fR\s0 command shows our current position as a hierarchy of functions, and the \s10\f(CWnext\fR\s0 command treats functions as black boxes. 
There are two possible approaches when using a debugger: .Ls B .Li Wait until something goes wrong, then find out where it happened. This is d223 3 a225 3 .Li If the process does end up in no-mans-land, you may see something like something like: d229 2 a230 2 (gdb) \f(CBbt\f(CW \fI\&abbreviation for \f(BIbacktrace\f(CW #0 0x0 in ?? () \fI\&nowhere\f(CW a239 239 .Le .Ah "Tracing system calls" An alternative approach is to divide the program between system code and user code. Most systems have the ability to trace the parameters supplied to each system call and the results that they return. This is a very system-dependent function, and there are a number of different programs to perform the trace: \fItruss\fR runs on System V.4, \fIktrace\fR runs on BSD NET/2 and 4.4BSD derived systems, and \fItrace\fR runs on SunOS 4. They vary significantly in their features. We'll look briefly at each. .Bh "trace" \fItrace\fR is a relatively primitive tool supplied with SunOS 4 systems: .Ls B .Li It can either start a process or attach to an existing process. .Li It can print summary information or a detailed trace. .Le In particular, it \fIcannot\fR trace the child of a \s10\f(CWfork\fR\s0 call, which is a great disadvantage. 
Here's an example of \fItrace\fR output with a possibly recognizable program: .Ps $ \f(CBtrace hello\f(CW open ("/usr/lib/ld.so", 0, 040250) = 3 read (3, "".., 32) = 32 mmap (0, 40960, 0x5, 0x80000002, 3, 0) = 0xf77e0000 mmap (0xf77e8000, 8192, 0x7, 0x80000012, 3, 32768) = 0xf77e8000 open ("/dev/zero", 0, 07) = 4 getrlimit (3, 0xf7fff488) = 0 mmap (0xf7800000, 8192, 0x3, 0x80000012, 4, 0) = 0xf7800000 close (3) = 0 getuid () = 1004 getgid () = 1000 open ("/etc/ld.so.cache", 0, 05000100021) = 3 fstat (3, 0xf7fff328) = 0 mmap (0, 4096, 0x1, 0x80000001, 3, 0) = 0xf77c0000 close (3) = 0 open ("/opt/lib/gcc-lib/sparc-sun-sunos".., 0, 01010525) = 3 fstat (3, 0xf7fff328) = 0 getdents (3, 0xf7800108, 4096) = 212 getdents (3, 0xf7800108, 4096) = 0 close (3) = 0 open ("/opt/lib", 0, 056) = 3 getdents (3, 0xf7800108, 4096) = 264 getdents (3, 0xf7800108, 4096) = 0 close (3) = 0 open ("/usr/lib/libc.so.1.9", 0, 023170) = 3 read (3, "".., 32) = 32 mmap (0, 458764, 0x5, 0x80000002, 3, 0) = 0xf7730000 mmap (0xf779c000, 16384, 0x7, 0x80000012, 3, 442368) = 0xf779c000 close (3) = 0 open ("/usr/lib/libdl.so.1.0", 0, 023210) = 3 read (3, "".., 32) = 32 mmap (0, 16396, 0x5, 0x80000002, 3, 0) = 0xf7710000 mmap (0xf7712000, 8192, 0x7, 0x80000012, 3, 8192) = 0xf7712000 close (3) = 0 close (4) = 0 getpagesize () = 4096 brk (0x60d8) = 0 brk (0x70d8) = 0 ioctl (1, 0x40125401, 0xf7ffea8c) = 0 write (1, "Hello, World!\n", 14) = Hello, World! 14 close (0) = 0 close (1) = 0 close (2) = 0 exit (1) = ? .Pe What's all this output? All we did was a simple write, but we have performed a total of 43 system calls. This shows in some detail how much the viewpoint of the world differs when you're on the other side of the system library. 
This program, which was run on a SparcStation 2 with SunOS 4.1.3, first sets up the shared libraries (the sequences of \s10\f(CWopen\fR\s0, \s10\f(CWread\fR\s0, \s10\f(CWmmap\fR\s0, and \s10\f(CWclose)\fR\s0, then initializes the \s10\f(CWstdio\fR\s0 library (the calls to \s10\f(CWgetpagesize\fR\s0, \s10\f(CWbrk\fR\s0, \s10\f(CWioctl\fR\s0, and \s10\f(CWfstat\fR\s0), and finally writes to \fIstdout\fR and exits. It also looks strange that it closed \fIstdin\fR before writing the output text: again, this is a matter of perspective. The \s10\f(CWstdio\fR\s0 routines buffer the text, and it didn't actually get written until the process exited, just before closing \fIstdout\fR. .Bh "ktrace" \fIktrace\fR is supplied with newer BSD systems. Unlike the other trace programs, it writes unformatted data to a log file (by default, \fIktrace.out\fR), and you need to run another program, \fIkdump\fR, to display the log file. It has the following options: .Ls B .Li It can trace the descendents of the process it is tracing. This is particularly useful when the bug occurs in large complexes of processes, and you don't even know which process is causing the problem. .Li It can attach to processes which are already running. Optionally, it can also attach to existing children of the processes to which it attaches. .Li It can specify broad subsets of system calls to trace: system calls, namei translations (translation of file name to inode number), I/O and signal processing. .Le Here's an example of \fIktrace\fR running against the same program: .Ps $ \f(CBktrace hello\f(CW Hello, World! 
$ \f(CBkdump\f(CW 20748 ktrace RET ktrace 0 20748 ktrace CALL getpagesize 20748 ktrace RET getpagesize 4096/0x1000 20748 ktrace CALL break(0xadfc) 20748 ktrace RET break 0 20748 ktrace CALL break(0xaffc) 20748 ktrace RET break 0 20748 ktrace CALL break(0xbffc) 20748 ktrace RET break 0 20748 ktrace CALL execve(0xefbfd148,0xefbfd5a8,0xefbfd5b0) 20748 ktrace NAMI "./hello" 20748 hello RET execve 0 20748 hello CALL fstat(0x1,0xefbfd2a4) 20748 hello RET fstat 0 20748 hello CALL getpagesize 20748 hello RET getpagesize 4096/0x1000 20748 hello CALL break(0x7de4) 20748 hello RET break 0 20748 hello CALL break(0x7ffc) 20748 hello RET break 0 20748 hello CALL break(0xaffc) 20748 hello RET break 0 20748 hello CALL ioctl(0x1,TIOCGETA,0xefbfd2e0) 20748 hello RET ioctl 0 20748 hello CALL write(0x1,0x8000,0xe) 20748 hello GIO fd 1 wrote 14 bytes "Hello, World! " 20748 hello RET write 14/0xe 20748 hello CALL exit(0xe) .Pe This display contains the following information in columnar format: .Ls B .Li The first column shows the process ID of the process. .Li The second column shows the name of the program from which the process was started. We can see that the name changes after the call to \s10\f(CWexecve\fR\s0. .Li The third column show the kind of event. \s10\f(CWCALL\fR\s0 is a system call, \s10\f(CWRET\fR\s0 is a return value from a system call, \s10\f(CWNAMI\fR\s0 is a system internal call to the function \s10\f(CWnamei\fR\s0, which determines the inode number for a pathname, and \s10\f(CWGIO\fR\s0 is a system internal I/O call. .Li The fourth column shows the parameters to the call. .Le In this trace, run on an Intel 486 with BSD/OS 1.1, we can see a significant difference from SunOS: there are no shared libraries--even though each system call produces two lines of output (the call and the return value), the output is much shorter. .Bh "truss" \fItruss\fR, the System V.4 trace facility, offers the most features: .Ls B .Li It can print statistical information instead of a trace. 
.Li It can display the argument and environment strings passed to each call to \s10\f(CWexec\fR\s0. .Li It can trace the descendents of the process it is tracing. This is particularly useful when the bug occurs in large complexes of processes, and you don't even know which process is causing the problem. .Li It can attach to processes which are already running. Optionally, it can also attach to existing children of the processes to which it attaches. .Li It can trace specific system calls, signals, and interrupts (called \fIfaults\fR in System V terminology). This is a very useful feature: as we saw in the \fIktrace\fR example above, the C library may call a surprising number .Le Here's an example of \fItruss\fR output: .Ps $ \f(CBtruss -f hello\f(CW 511: execve("./hello", 0x08047834, 0x0804783C) argc = 1 511: getuid() = 1004 [ 1004 ] 511: getuid() = 1004 [ 1004 ] 511: getgid() = 1000 [ 1000 ] 511: getgid() = 1000 [ 1000 ] 511: sysi86(SI86FPHW, 0x80036058, 0x80035424, 0x8000E255) = 0x00000000 511: ioctl(1, TCGETA, 0x08046262) = 0 Hello, World! 511: write(1, " H e l l o , W o r l d".., 14) = 14 511: _exit(14) .Pe If we're interested in the parameters to the \s10\f(CWioctl\fR\s0 call, we can specify this with the \s10\f(CW-v\fR\s0 (verbose) flag: .Ps $ \f(CBtruss -f -v ioctl hello\f(CW \fI\&...\f(CW 516: ioctl(1, TCGETA, 0x08046262) = 0 516: iflag=0004402 oflag=0000005 cflag=0002675 lflag=0000073 line=0 516: cc: 177 003 010 030 004 000 000 000 .Pe .Bh "Tracing through fork" We've seen that \fIktrace\fR and \fItruss\fR can both trace the child of a \s10\f(CWfork\fR\s0 system call. This is invaluable: UNIX program packages frequently start multiple processes to do the work on hand. Frequently enough, the program that you start does nothing more than spawn a number of other processes and wait for them to stop. One of the significant disadvantages of the \s10\f(CWptrace\fR\s0 interface to debugging is that the process needs to be started by the debugger. 
Even in SunOS 4, where you can attach to a process that is already running, there is no way to monitor it from the start. In some cases you can determine how the process was started and start it with the debugger in the same manner. This is not always possible--for example, many child processes communicate with their parent. .LP Unfortunately, SunOS trace doesn't support tracing through \s10\f(CWfork\fR\s0, and \fItruss\fR does it better than \fIktrace\fR. In extreme cases (like debugging a program of this nature on SunOS 4, where there is no support for trace through \s10\f(CWfork\fR\s0), you might find it an advantage to port to a different platform (such as Solaris 2) in order to be able to test with \fItruss\fR. Of course, Murphy's law says that the bug won't show up under Solaris 2. .Ah "Tracing network traffic" Another interface which we can trace is the network interface. Many processes communicate across the network, and if we have tools to look at this communication, they may help us isolate the part of the package which is causing the problem. .LP Two programs support tracing message flow across a network: .Ls B .Li \fItcpdump\fR and the \fIBerkeley Packet Filter\fR provide a flexible means of tracing traffic across Internet domain sockets. It is included on the companion CD-ROM as \fInet/tcpdump-2.2.1\fR. .Li \fItrpt\fR will print a trace from a socket marked for debugging. This function is available on System V.4 as well, though it is not clear what use it is under these circumstances, since System V.4 emulates sockets in a library module. On BSD systems, it comes a poor second to \fItcpdump\fR. .Le Tracing net traffic is an unusual approach, and we won't consider it here, but in certain circumstances it is an invaluable tool. You can find all you need to know about \fItcpdump\fR in [Stevens 94]. .XX "Stevens, W. 
Richard" d241 3 d251 1 a251 1 (gdb) r -display allegro:0 \fI\&run the program\f(CW d256 2 a257 2 (gdb) bt \fI\&look back down the stack\f(CW #0 0x3b0bc in _XtMemmove () \fI\&all these functions come from the X toolkit\f(CW d270 11 a280 17 you had just written, it would probably be a bug in your program. In this case, where we have just built the complete X11 core system, there's every possibility thhat it is a library bug. As usual, the library was compiled without debug information, and without that you hardly have a hope of finding it. Apart from size constraints, there is no reason why you can't include debugging information in a library. The object files in libraries are just the same as any others--we discuss them in detail on page \*[libdef]. If you want, you can build libraries with debugging information, or you can take individual library routines and compile them separately. Unfortunately, the size constraints are significant: without debugging information, the file \fIlibXt.a\fR is about 330 kB long and contains 53 object files. With debugging information, it might easily reach 20 MB, since all the myriad X11 global symbols would be included with each object file in the archive. It's not just a question of disk space: you also need virtual memory during the link phase to accommodate all these symbols. Most of these files don't interest us anyway: the first one that does is the one that contains \s10\f(CW_XtMemmove\fR\s0. So we find where it is and compile it alone with debugging information. d282 12 d297 3 a299 2 the X toolkit. An alternative would be to go back to our \fImake\fR log and look for the text \fIXt\fR. If we do this, we quickly find d303 1 a303 1 mv Makefile Makefile.bak d308 1 d319 6 a324 3 So \s10\f(CWXtMemmove\fR\s0 is in \fIAlloc.c\fR. By the same method, we look for the other functions mentioned in the stack trace and discover that we also need to recompile \fIInitialize.c\fR and \fIDisplay.c\fR. 
d332 5 a336 2 We can modify the \fIMakefile\fR (the modifications will go away at the next \fImake World\fR, so this is not overly dangerous). d343 2 a344 2 $ \f(CBrm Alloc.o\f(CW \fI\&remove the old object\f(CW $ \f(CBmake\f(CW \fI\&and start make normally\f(CW d348 1 a348 1 make: *** [Alloc.o] Interrupt \fIinterrupt make with CTRL-C\f(CW d351 1 a351 1 $ \f(CBmake\f(CW \fI\&run make to build a new library\f(CW d374 1 a374 1 .IP d380 4 a383 1 .IP d388 1 a388 1 $ \f(CBpushd ../../programs/xterm/\f(CW d398 1 a398 7 .IP The shell \s10\f(CWpushd\fR\s0 command changes directories, like the \s10\f(CWcd\fR\s0 command. Unlike the \s10\f(CWcd\fR\s0 command, it doesn't forget the old directory, and you can change back using \s10\f(CWpushd\fR\s0 without an argument. Since we expect to be back in the library directory again, this saves us some time. .Le d404 1 a404 1 (gdb) \f(CBdir ../../lib/X11\f(CW \fI\&set source paths\f(CW d411 1 a411 1 (gdb) \f(CBr\f(CW \fI\&and run the program\f(CW d415 1 a415 1 0x3ced6 in _XtMemmove (dst=0x342d8 "ミE003円", src=0x41c800 "", length=383) \e d440 1 a440 1 (gdb) \f(CBx/6i $eip\f(CW \fI\&show the \f(CW d471 1 d540 1 a540 1 (gdb) \f(CBp table\f(CW \fI\&look again, to be sure\f(CW d542 2 a543 2 (gdb) \f(CBs\f(CW \fI\&single step into memmove\f(CW _XtMemmove (dst=0x342d8 "ミE003円", src=0x41c800 "", length=384) d565 1 a565 1 (gdb) \f(CBx/8i $eip\f(CW \fI\&look at the next 8 instructions\f(CW d641 1 a641 1 .RS d649 1 a649 1 .RE d671 1 a671 1 make: *** [Initialize.o] Interrupt \fI\&hit CTRL-C\f(CW d678 2 a679 7 It doesn't really matter what you call the output file unless you use \fIemacs\fR. \fIemacs\fR recognizes the suffix \fI.c\fR and loads macros for C programs. .LP As you might have guessed, we now look at the file \fIjunk.c\fR with \fIemacs\fR, though you could use any other editor. We're looking for \s10\f(CWmemmove\fR\s0, of course. We find a definition in d710 9 a718 6 This is a real example. It happened with X11R6, Patch level 3. 
Somewhere in here is a lesson to be learnt: X11 is one of the most reliable and most portable software packages available, and yet here we have a really primitive bug. The reason it has not been found before is doubtless due to the fact that I was building this version of X11 in an unusual environment (SCO UNIX and GNU libc), and so the usual assumptions didn't apply. d755 2 a756 2 If the variable \s10\f(CWX_NOT_STDC_ENV\fR\s0 is \fInot\fR defined, it assumes ANSI C, unless this is a pre-SVR4 Sun machine. d758 1 a758 1 Otherwise it checks the variables \s10\f(CWSYSV\fR\s0 (for System V.3), d781 1 a781 1 .IP d789 1 a789 1 need to look further for this one. d837 156 a992 1 .Ah "Summary" d995 77 a1071 1 Debugging is a black art. d1073 5 a1077 1 Most bugs in ported software are due to incorrect configuration. d1079 5 a1083 1 FOO d1085 4 a1088 1 @ 2.1 log @Minor mods @ text @d2 1 a2 1 .\" $Id: testing.ms,v 2.0 1994年12月21日 16:58:30 grog Exp grog $ d4 2 d7 1 d10 1 a10 1 .St "Testing ($Date: 1994年12月21日 16:58:30 $)" d79 1 a79 1 .Bh "Symbolic debuggers" d132 3 a134 3 generates a hardware interrupt which ultimately results in a \s10\f(CWSIGTRAP\fR\s0 signal to the debugger. \fIgdb\fR performs this function with the \s10\f(CWstepi\fR\s0 command. d136 2 a137 2 Nowadays, you won't want to look at single machine instructions until you are in deep trouble. Instead, you will execute a \fIsingle line\fR instruction, which d139 6 a144 6 the confusion, this is also frequently called \fIsingle stepping\fR. They come in two flavours, depending on how they treat function calls. One form will execute the function and stop the program at the instruction after the call instruction. The other, more thorough form will stop execution at the first address in the function after the stack trace linkage has been established (see \*[chobj], page \*[fn-entry-breakpoint]). It's important to notice the d164 7 a170 2 \fIwatchpoint\fR. 
This involves single-stepping through the program, which slows down the execution speed by several orders of magnitude. d174 4 a177 4 A \fIstack trace\fR command is almost the most useful of all commands. It's certainly the first thing you should do when examining a core dump or after getting a signal while debugging the program. It answers the question "Where am I, and how did I get here?" \fIgdb\fR implements this function with the d196 1 a196 1 (gdb) \f(CBbt\f(CW \fI\&abbreviation for \f(BIbacktrace\f(CW d208 1 a208 1 .Bh "Tracing system calls" d216 1 a216 1 .Ch "trace" d286 1 a286 1 .Ch "ktrace" d301 2 a302 1 translations, I/O and signal processing. d304 1 a304 2 Here's an example of \fIktrace\fR running against a possibly recognizable program: d361 1 a361 1 .Ch "truss" d404 1 a404 1 .Ch "Tracing through fork" d412 4 a415 3 that is already running, there is no way to monitor it from the start. This may not be possible--sometimes the child process will only work correctly if it is started by its parent. d417 8 a424 7 Unfortunately, not all packages support this feature, and \fItruss\fR does it better than \fIktrace\fR. In extreme cases (like debugging a program of this nature on SunOS 4, where there is no support for trace through \s10\f(CWfork\fR\s0), you might find it an advantage to port to a different platform (such as Solaris 2) in order to be able to test with \fItruss\fR. Of course, Murphy's law says that the bug won't show up under Solaris 2. .Bh "Tracing network traffic" d443 3 a445 1 in certain circumstances it is an invaluable tool. d459 1 a459 1 (gdb) bt \fI\&look back down the stack\f(CW d474 13 a486 13 where we have just built the complete X11 core system, it looks more like a library bug. As usual, the library was compiled without debug information, and without that you hardly have a hope of finding it. Apart from size constraints, there is no reason why you can't include debugging information in a library. 
As we discussed on page \*[libdef], the object files in libraries are just the same as any others. If you want, you can build libraries with debugging information, or you can take individual library routines and compile them separately. Unfortunately, the size constraints are significant: without debugging information, the file \fIlibXt.a\fR is about 330 kB long and contains 53 object files. With debugging information, it might easily reach 20 MB, since all the myriad X11 global symbols would be included with each object file in the archive. It's not just a question of disk space: you also need virtual memory during the link phase to accommodate all these symbols. Of course, most of @ 2.0 log @checked in with -k by grog at 1995年01月09日 13:22:41 @ text @d2 1 a2 1 .\" $Id: testing.ms,v 1.25 1994年12月21日 16:58:30 grog Exp grog $ a3 2 .\" Revision 1.25 1994年12月21日 16:58:30 grog .\" Revised, mods for bignuts macros a4 19 .\" Revision 1.24 1994年11月07日 17:14:21 grog .\" Mods after Andy's review .\" .\" Revision 1.23 1994年10月17日 17:30:06 grog .\" *** empty log message *** .\" .\" Revision 1.22 1994年09月30日 17:58:33 grog .\" Snapshot 30 September 94 .\" .\" Revision 1.21 1994年09月01日 13:30:58 grog .\" Snapshot 1 September 1994 .\" .\" Revision 1.20 1994年08月25日 17:10:13 grog .\" Change all names from .roff to .ps, set uniform version number 1.20, minor mods .\" .\" Revision 1.1 1994年08月19日 12:13:37 grog .\" Initial revision .\" .\" a15 1 $ d20 4 a23 4 debugging techniques. We can only touch on aspects of debugging which relate to porting. In this chapter, we'll first look at debugging tools, then we'll take a typical, if somewhat involved, real-life bug and solve it, discussing the pros and cons on the way. d29 2 a30 2 it. If there are any tests, you should obviously run them. 
Otherwise you should consider writing some and including them as a target \s10\f(CWtest\fR\s0 d57 3 a59 3 necessarily have to make sense, but if you can divide your problem into between two and five parts and easily determine where the problem is, it won't take long to find the bug. d83 3 a85 3 and though it's not perfect, it runs rings around putting \fIprintf\fR statements in your programs. You can find a copy on the companion CD-ROM in the directory \fIgnu/gdb-4.13\fR. @
