head 3.1; access; symbols; locks; strict; comment @.\" @; 3.1 date 95.09.03.15.39.33; author grog; state Exp; branches; next 3.0; 3.0 date 95.06.25.14.41.01; author grog; state Exp; branches; next 2.5; 2.5 date 95.06.25.11.59.58; author grog; state Exp; branches; next 2.4; 2.4 date 95.06.24.11.01.11; author grog; state Exp; branches; next 2.3; 2.3 date 95.06.09.04.32.57; author grog; state Exp; branches; next 2.2; 2.2 date 95.06.03.08.22.09; author grog; state Exp; branches; next 2.1; 2.1 date 95.02.04.17.14.18; author grog; state Exp; branches; next 2.0; 2.0 date 94.12.17.17.22.18; author grog; state Exp; branches; next ; desc @@ 3.1 log @Better description of int-related types @ text @.\" For emacs, this file is in -*- nroff-fill -*- mode .\" $Id: headers.ms,v 3.0 1995/06/25 14:41:01 grog Exp grog $ .\" $Log: headers.ms,v $ .\" Revision 3.0 1995/06/25 14:41:01 grog .\" Final draft .\" .\" Revision 2.4 1995/06/24 11:01:11 grog .\" Final draft, first cut. .\" .\" Revision 2.3 1995/06/09 04:32:57 grog .\" Remove date from page headers .\" Minor mods .\" .\" Revision 2.2 1995/06/03 08:22:09 grog .\" Major mods after Andy's final draft review .\" .\" Revision 2.1 1995/02/04 17:14:18 grog .\" Minor mods .\" .\" Revision 1.11 1994/12/17 17:22:18 grog .\" Minor mods for bignuts macros .\" .\" Revision 1.10 1994/11/30 11:19:48 grog .\" Minor mods .\" .\" Revision 1.8 1994/11/27 14:49:30 grog .\" Minor review .\" .\" Revision 1.7 1994/09/30 17:58:33 grog .\" Snapshot 30 September 94 .\" .\" Revision 1.6 1994/09/04 17:30:48 grog .\" Minor mods .\" .\" Revision 1.4 1994/08/09 15:50:48 grog .\" Minor mods .\" .\" Revision 1.3 1994/07/08 16:44:25 grog .\" This file has been taken up into library.roff, and is not .\" longer needed. 
.\" .\" Revision 1.2 1994/06/09 12:44:34 grog .\" Minor updates .\" .\" Revision 1.1 1994/05/11 17:20:31 grog .\" Initial revision .\" .so global.ms .Se \*[nchheaders] "Header files" .St "Header files" .XX "header files" .Pn header-files When the C language was young, header files were required to define structures and occasionally to specify that a function did something out of the ordinary like taking a \s10\f(CWdouble\fR\s0 parameter or returning a \s10\f(CWfloat\fR\s0 result. Then ANSI C and POSIX came along and changed all that. .LP Header files seem a relatively simple idea, but in fact they can be a major source of annoyance in porting. In particular: .Ls B .Li ANSI and POSIX.1 have added a certain structure to the usage of header files, but there are still many old-fashioned headers out there. .Li ANSI and POSIX.1 have also placed more stringent requirements on data types used in header files. This can cause conflicts with older systems, especially if the author has committed the sin of trying to out-guess the header files. .Li C++ has special requirements of header files. If your header files don't fulfil these requirements, the GNU \fIprotoize\fR program can usually fix them. .Li There is still no complete agreement on the names of header files, or in which directories they should be placed. In particular, System V.3 and System V.4 frequently disagree as to whether a header file should be in \fI/usr/include\fR or in \fI/usr/include/sys\fR. .Le .Ah "ANSI C, POSIX.1, and header files" ANSI C and POSIX.1 have had a far-reaching effect on the structure of system header files. We'll look at the changes in the C language in more detail in \*[chcompiler]. The following points are relevant to the use of header files: .Ls B .Li ANSI C prefers to have an ANSI-style prototype for every function call it encounters. If it doesn't find one, it can't check the function call semantics as thoroughly, and it may issue a warning. 
It's a good idea to enable all such warnings, but this kind of message makes it difficult to recognize the real errors hiding behind the warnings. In C++, the rules are even stricter: if you don't have a prototype, it's an error and your source file doesn't compile. .Li To do a complete job of error checking, ANSI C requires the prototype in the new, embedded form: .Ps int foo (char *zot, int glarp); .Pe and not .Ps int foo (zot, glarp); char *zot; .Pe .IP Old C compilers don't understand this new kind of prototype. .Li Header files usually contain many definitions that are not part of POSIX.1. A mechanism is needed to disable these definitions if you are compiling a program intended to be POSIX.1 compatible.\** .FS Writing your programs to conform to POSIX.1 may be a good idea if you want them to run on as many platforms as possible. On the other hand, it may also be a bad idea: POSIX.1 has very rudimentary facilities in some areas. You may find it more confining than is good for your program. .FE .Le The result of these requirements is spaghetti header files: you frequently see things like this excerpt from the header file \fIstdio.h\fR in 4.4BSD: .XX "stdio.h" .Ps /* * Functions defined in ANSI C standard. */ _\/_BEGIN_DECLS void clearerr _\/_P((FILE *)); int fclose _\/_P((FILE *)); #if !defined(_ANSI_SOURCE) && !defined(_POSIX_SOURCE) extern int sys_nerr; /* perror(3) external variables */ extern _\/_const char *_\/_const sys_errlist[]; #endif void perror _\/_P((const char *)); _\/_END_DECLS /* * Functions defined in POSIX 1003.1. */ #ifndef _ANSI_SOURCE #define L_cuserid 9 /* size for cuserid(); UT_NAMESIZE + 1 */ #define L_ctermid 1024 /* size for ctermid(); PATH_MAX */ _\/_BEGIN_DECLS char *ctermid _\/_P((char *)); _\/_END_DECLS #endif /* not ANSI */ /* * Routines that are purely local. 
*/ #if !defined (_ANSI_SOURCE) && !defined(_POSIX_SOURCE) _\/_BEGIN_DECLS char *fgetln _\/_P((FILE *, size_t *)); _\/_END_DECLS .Pe Well, it \fIdoes\fR look vaguely like C, but this kind of header file scares most people off. A number of conflicts have led to this kind of code: .Ls B .Li The ANSI C library and POSIX.1 carefully define a subset of the total available functionality. If you want to abide strictly by the standards, any extension must be flagged as an error, even if it would work. .Li The C++ language has a different syntax from C, but both languages share a common set of header files. .Le These solutions have caused new problems, which we'll examine in this chapter. .Ah "ANSI and POSIX.1 restrictions" .XX "C language, ANSI restrictions" .XX "C language, POSIX.1 restrictions" Most current UNIX implementations do not conform completely with POSIX.1 and ANSI C, and every implementation offers a number of features that are not part of either standard. A program that conforms with the standards must not use these features. You can specify that you wish your program to be compliant with the standards by defining the preprocessor variables \s10\f(CW_ANSI_SOURCE\fR\s0 or \s10\f(CW_POSIX_SOURCE\fR\s0, which maximizes the portability of the code. It does this by preventing the inclusion of certain definitions. In our example, the array \s10\f(CWsys_errlist\fR\s0 (see \*[chlib], page \*[sys_errlist]) is not part of POSIX.1 or ANSI, so the definition is not included if either preprocessor variable is set. If we refer to \s10\f(CWsys_errlist\fR\s0 anyway, the compiler reports an error, since the array hasn't been declared. Similarly, \s10\f(CWL_cuserid\fR\s0 is defined in POSIX.1 but not in ANSI C, so it is defined only when \s10\f(CW_POSIX_SOURCE\fR\s0 is defined and \s10\f(CW_ANSI_SOURCE\fR\s0 is not defined. 
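.LP
As a rough sketch of this mechanism, assume a hypothetical header that guards its definitions in the same way as the \fIstdio.h\fR excerpt above. The names \s10\f(CWMY_ANSI_FEATURE\fR\s0, \s10\f(CWMY_POSIX_FEATURE\fR\s0, \s10\f(CWMY_LOCAL_FEATURE\fR\s0 and \s10\f(CWvisible_features\fR\s0 are invented for this illustration and do not come from any real system header:

```c
/*
 * Sketch only: the MY_* names and visible_features() are hypothetical.
 * Request POSIX.1 definitions, unless the compilation environment has
 * already defined _POSIX_SOURCE for us.
 */
#ifndef _POSIX_SOURCE
#define _POSIX_SOURCE 1
#endif

/* What a header guarded like the stdio.h excerpt might contain. */
#define MY_ANSI_FEATURE 1          /* part of ANSI C: always visible */

#ifndef _ANSI_SOURCE
#define MY_POSIX_FEATURE 1         /* in POSIX.1, but not in ANSI C */
#endif

#if !defined(_ANSI_SOURCE) && !defined(_POSIX_SOURCE)
#define MY_LOCAL_FEATURE 1         /* purely local extension */
#endif

/* Return a bit mask showing which groups of definitions are visible. */
int visible_features (void)
{
  int mask = 0;

#ifdef MY_ANSI_FEATURE
  mask |= 1;                       /* ANSI C group */
#endif
#ifdef MY_POSIX_FEATURE
  mask |= 2;                       /* POSIX.1 group */
#endif
#ifdef MY_LOCAL_FEATURE
  mask |= 4;                       /* local extensions */
#endif
  return mask;
}
```

With \s10\f(CW_POSIX_SOURCE\fR\s0 defined and \s10\f(CW_ANSI_SOURCE\fR\s0 not defined, the ANSI and POSIX.1 groups are visible and the local extension is hidden. If \s10\f(CW_ANSI_SOURCE\fR\s0 were defined instead, only the first group would remain.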
.Ah "Declarations for C++" .XX "C++, function declarations" .XX "function declarations, C++" .Pn mangling C++ has additional requirements of symbol naming: \fIfunction overloading\fR .XX "C++, function overloading" .XX "function overloading, C++" allows different functions to have the same name. Assemblers don't think this is funny at all, and neither do linkers, so the names need to be changed to be unique. In addition, the names need to somehow reflect the class to which they belong, the kind of parameters that the function takes and the kind of value it returns. This is done by a technique called \fIfunction name encoding\fR, .XX "C++, function name encoding" .XX "function name encoding, C++" usually called \fIfunction name mangling\fR. .XX "function name mangling, C++" The parameter and return value type information is appended to the function name according to a predetermined rule. To illustrate this, let's look at a simple function declaration: .Ps double Internal::sense (int a, unsigned char *text, Internal &p, ...); .Pe .Ls B .Li First, two underscores are appended to the name of the function. With the initial underscore we get for the assembler, the name is now \s10\f(CW_sense_\/_\fR\s0. .Li Then the class name, \s10\f(CWInternal\fR\s0, is added. Since the length of the name needs to be specified, this is put in first: \s10\f(CW_sense_\/_8Internal\fR\s0. .Li Next, the parameters are encoded. Simple types like int and char are abbreviated to a single character (in this case, \s10\f(CWi\fR\s0 and \s10\f(CWc\fR\s0). If they have modifiers like \s10\f(CWunsigned\fR\s0, these, too, are encoded and precede the type information. In this case, we get just plain \s10\f(CWi\fR\s0 for the int parameter, and \s10\f(CWPUc\fR\s0 (a \s10\f(CWP\fR\s0ointer to \s10\f(CWU\fR\s0nsigned \s10\f(CWc\fR\s0haracters) for the second parameter: \s10\f(CW_sense_\/_8InternaliPUc\fR\s0. 
.Li Class or structure references again can't be coded ahead of time, so again the length of the name and the name itself are used. In this case, we have a reference, so the letter \s10\f(CWR\fR\s0 is placed in front of the name: \s10\f(CW_sense_\/_8InternaliPUcR8Internal\fR\s0. .Li Finally, the ellipsis is specified with the letter \s10\f(CWe\fR\s0: \s10\f(CW_sense_\/_8InternaliPUcR8Internale\fR\s0. .Le For more details on function name mangling, see \fIThe Annotated C++ Reference Manual\fR by Margaret Ellis and Bjarne Stroustrup. .XX "Ellis, Margaret" .XX "Stroustrup, Bjarne" .XX "declarations for C++" .XX "C++, declarations for" .LP .Pn extern-C++ This difference in naming is a problem when a C++ program really needs to call a function written in C. The name in the object file is not mangled, and so the C++ compiler must not output a reference to a mangled name. Theoretically, there could be other differences between C++ calls and C calls that the compiler also needs to take into account. You can't just assume that a function written in another language adheres to the same conventions, so you have to tell the compiler when a called function is written according to C conventions rather than according to C++ conventions. .LP This is done with the following elegant construct: .Ps extern "C" { char *ctermid (char *); }; .Pe In ANSI C, the same declaration would be .Ps char *ctermid (char *); .Pe and in K&R C it would be .Ps char *ctermid (); .Pe It would be a pain to have a separate set of header files for each version. Instead, the implementors defined preprocessor variables which evaluate to language constructs for certain places: .Ls B .Li \s10\f(CW_\/_BEGIN_DECLS\fR\s0 is defined as \s10\f(CWextern "C" {\fR\s0 for C++ and nothing otherwise. .Li \s10\f(CW_\/_END_DECLS\fR\s0 is defined as \s10\f(CW};\fR\s0 for C++ and nothing otherwise. .Li .Pn __P \s10\f(CW_\/_P(foo)\fR\s0 is defined as \s10\f(CWfoo\fR\s0 for C++ and ANSI C, and nothing otherwise. 
This is the reason why the arguments to \s10\f(CW_\/_P()\fR\s0 are enclosed in double parentheses: the outside level of parentheses gets stripped by the preprocessor. .Le .XX "sys/cdefs.h" .XX "stdio.h" In this implementation, \fIsys/cdefs.h\fR defines these preprocessor variables. What happens if \fIsys/cdefs.h\fR isn't included before \fIstdio.h\fR? Lots of error messages. So one of the first lines in \fIstdio.h\fR is \s10\f(CW#include <sys/cdefs.h>\fR\s0. This is not the only place that \fIsys/cdefs.h\fR is included: in this particular implementation, from 4.4BSD, it is included from \fIassert.h\fR, \fIdb.h\fR, \fIdirent.h\fR, \fIerr.h\fR, \fIfnmatch.h\fR, \fIfstab.h\fR, \fIfts.h\fR, \fIglob.h\fR, \fIgrp.h\fR, \fIkvm.h\fR, \fIlocale.h\fR, \fImath.h\fR, \fInetdb.h\fR, \fInlist.h\fR, \fIpwd.h\fR, \fIregex.h\fR, \fIregexp.h\fR, \fIresolv.h\fR, \fIrunetype.h\fR, \fIsetjmp.h\fR, \fIsignal.h\fR, \fIstdio.h\fR, \fIstdlib.h\fR, \fIstring.h\fR, \fItime.h\fR, \fIttyent.h\fR, \fIunistd.h\fR, \fIutime.h\fR and \fIvis.h\fR. This places an additional load on the compiler, which reads in a 100-line definition file multiple times. It also creates the possibility for compiler errors. \fIsys/cdefs.h\fR defines a preprocessor variable \s10\f(CW_CDEFS_H_\fR\s0 in order to avoid this problem: after the obligatory UCB copyright notice, it starts with .Ps #ifndef _CDEFS_H_ #define _CDEFS_H_ #if defined(_\/_cplusplus) #define _\/_BEGIN_DECLS extern "C" { #define _\/_END_DECLS }; #else #define _\/_BEGIN_DECLS #define _\/_END_DECLS #endif .Pe This is a common technique introduced by ANSI C: the preprocessor only processes the body of the header file the first time. After that, the preprocessor variable \s10\f(CW_CDEFS_H_\fR\s0 is defined, and the body will not be processed again. .LP There are a couple of things to note about this method: .Ls B .Li There are no hard and fast rules about the naming and definition of these auxiliary variables. The result is that not all header files use this technique. 
For example, in FreeBSD 1.1, the header file \fImachine/limits.h\fR defines a preprocessor variable .XX "machine/limits.h" \s10\f(CW_MACHINE_LIMITS_H\fR\s0 and only interprets the body of the file if this preprocessor variable was not set on entry. BSD/OS 1.1, on the other hand, does not. The same header file is present, and the text is almost identical, but there is nothing to stop you from including and interpreting \fImachine/limits.h\fR multiple times. The result can be that a package that compiles just fine under FreeBSD may fail to compile under BSD/OS. .\" XXX Andy doesn't want these two bullets because they're not strictly needed .\" for porting. I disagree: once ports start going sour and you start hacking .\" around, it's nice to know what you're doing. .Li The ANSI standard defines numerous standard preprocessor variables to ensure that header files are interpreted only the first time they are included. The variables all start with a leading \s10\f(CW_\fR\s0, and the second character is either another \s10\f(CW_\fR\s0 or an upper-case letter. It's a good idea to avoid using such symbols in your sources. .Li We could save including \fIsys/cdefs.h\fR .XX "sys/cdefs.h" multiple times by checking \s10\f(CW_CDEFS_H_\fR\s0 before including it. Unfortunately, this would establish an undesirable relationship between the two files: if for some reason it becomes necessary to change the name of the preprocessor variable, or perhaps to give it different semantics (like giving it different values at different times, instead of just being defined), you have to go through all the header files that refer to the preprocessor variable and modify them. .Le .Ah "ANSI header files" .XX "header files, ANSI" .XX "ANSI header files" .Pn ansi-headers The ANSI C language definition, also called \fIStandard C\fR, .XX "standard C" was the first to attempt some kind of standardization of header files. 
As far as it goes, it works well, but unfortunately it covers only a comparatively small number of header files. In ANSI C, .Ls B .Li The only header files you should need to include are \fIassert.h\fR, \fIctype.h\fR, \fIerrno.h\fR, \fIfloat.h\fR, \fIlimits.h\fR, \fIlocale.h\fR, \fImath.h\fR, \fIsetjmp.h\fR, \fIsignal.h\fR, \fIstdarg.h\fR, \fIstddef.h\fR, \fIstdio.h\fR, \fIstdlib.h\fR, \fIstring.h\fR and \fItime.h\fR. .Li You may include headers in any order. .Li You may include any header more than once. .Li Header files do not depend on other header files. .Li Header files do not include other header files. .Le If you can get by with just the ANSI header files, you won't have much trouble. Unfortunately, real-life programs usually require headers that aren't covered by the ANSI standard. .Bh "Type information" .XX "C language, type information" .XX "type information, C language" .Pn types.h A large number of system and library calls return information which can be represented in a single machine word. The machine word of the PDP-11, on which the Seventh Edition ran, was only 16 bits wide, and in some cases you had to squeeze the value to get it in the word. For example, the Seventh Edition file system represented an inode number in an \s10\f(CWint\fR\s0, so each file system could have only 65536 inodes. When 32-bit machines were introduced, people quickly took the opportunity to extend the length of these fields, and modern file systems such as \fIufs\fR or \fIvxfs\fR have 32 bit inode numbers. .XX "ufs" .XX "UNIX file system" .XX "vxfs" .XX "Veritas file system" .XX "inode numbers" .LP These changes were an advantage, but they bore a danger with them: nowadays, you can't be sure how long an inode number is. Current systems really do have different sized fields for inode numbers, and this presents a portability problem. 
Inodes aren't the only thing that has changed: consider the following structure definition, which contains information returned by system calls: .Ps struct process_info { long pid; /* process number */ long start_time; /* time process was started, from time () */ long owner; /* user ID of owner */ long log_file; /* file number of log file */ long log_file_pos; /* current position in log file */ short file_permissions; /* default umask */ short log_file_major; /* major device number for log file */ short log_file_minor; /* minor device number */ short inode; /* inode number of log file */ }; .Pe On most modern systems, the \s10\f(CWlong\fR\s0s take up 32 bits and the \s10\f(CWshort\fR\s0s take up 16 bits. Because of alignment constraints, we put the longest data types at the front and the shortest at the end (see \*[chhard], page \*[wordalign] for more details). And for older systems, these fields are perfectly adequate. But what happens if we port a program containing this structure to a 64-bit machine running System V.4 and \fIvxfs\fR? We've already seen that the inode numbers are now 32 bits long, and System V.4 major and minor device numbers also take up more space. If you port this package to 4.4BSD, the field \s10\f(CWlog_file_pos\fR\s0 needs to be 64 bits long. .LP Clearly, it's an oversimplification to assume that any particular kind of value maps to a \s10\f(CWshort\fR\s0 or a \s10\f(CWlong\fR\s0. The correct way to do this is to define a type that describes the value. 
In modern C, the structure above becomes: .Ps struct process_info { pid_t pid; /* process number */ time_t start_time; /* time process was started, from time () */ uid_t owner; /* user ID of owner */ long log_file; /* file number of log file */ off_t log_file_pos; /* current position in log file */ mode_t file_permissions; /* default umask */ short log_file_major; /* major device number for log file */ short log_file_minor; /* minor device number */ ino_t inode; /* inode number of log file */ }; .Pe It's important to remember that these type definitions are all in the mind of the compiler, and that they are defined in a header file, which is usually called \fIsys/types.h\fR: .XX "sys/types.h" the system handles them as integers of appropriate length. If you define them in this manner, you give the compiler an opportunity to catch mistakes and generate more reliable code. Check your man pages for the types of the arguments on your system if you run into trouble. In addition, \*[apptypes] contains an overview of the more common types used in UNIX systems. .Ah "Classes of header files" .XX "header files, classes" If you look at the directory hierarchy \fI/usr/include\fR, you may be .XX "/usr/include" astounded by the sheer number of header files, over 400 of them on a typical UNIX system. Fortunately, many of them are in subdirectories, and you usually won't have to worry about them, except for one subdirectory: \fI/usr/include/sys\fR. .Bh "/usr/include/sys" .XX "/usr/include/sys" In early versions of UNIX, this directory contained the header files used for compiling the kernel. Nowadays, this directory is intended to contain header files that relate to the UNIX implementation, though the usage varies considerably. You will frequently find files that directly include files from \fI/usr/include/sys\fR. In fact, it may come as a surprise that this is not supposed to be necessary. 
Often you will also see code like .Ps #ifdef USG /* System V */ #include <sys/err.h> #else /* non-System V system */ #include <err.h> #endif .Pe .XX "err.h" This simplified example shows what you need to do because System V keeps the header file \fIerr.h\fR in \fI/usr/include/sys\fR, whereas other flavours keep it in \fI/usr/include\fR. In order to include the file correctly, the source code needs to know what kind of system it is running on. If it guesses wrong (for example, if \s10\f(CWUSG\fR\s0 is not defined when it should be) or if the author of the package didn't allow for System V, either out of ignorance, or because the package has never been compiled on System V before, then the compilation will fail with a message about missing header files. .LP .Pn wait.h Frequently, the decisions made by the kind of code in the last example are incorrect. Some header files in System V have changed between System V.3 and System V.4. If, for example, you port a program written for System V.4 to System V.3, you may find things like .Ps #include <wait.h> .Pe .IP This will fail in most versions of System V.3, because there is no header file \fI/usr/include/wait.h\fR; the file is called .XX "/usr/include/wait.h" \fI/usr/include/sys/wait.h\fR. There are a couple of things you could do here: .Ls B .Li You could start the compiler with a supplementary \s10\f(CW-I/usr/include/sys\fR\s0, which will cause it to search \fI/usr/include/sys\fR for files specified without any pathname component. The problem with this approach is that you need to do it for every package that runs into this problem. .Li You could consider doing what System V.4 does in many cases: create a file called \fI/usr/include/wait.h\fR that contains just an obligatory copyright notice and an \s10\f(CW#include\fR\s0 directive enclosed in \s10\f(CW#ifdef\fR\s0s: .Ps /* THIS IS PUBLISHED NON-PROPRIETARY SOURCE CODE OF O'REILLY */ /* AND ASSOCIATES Inc. 
*/ /* The copyright notice above does not evidence any actual or */ /* intended restriction on the use of this code. */ #ifndef _WAIT_H #define _WAIT_H #include <sys/wait.h> #endif .Pe .Le .Bh "Problems with header files" .XX "header files, problems with" .XX "problems, with header files" It's fair to say that no system is supplied with completely correct system header files. Your system header files will probably suffer from at least one of the following problems: .Ls B .Li "Incorrect" naming. The header files contain the definitions you need, but they are not in the place you would expect. .Li Incomplete definitions. Function prototypes or definitions of structures and constants are missing. .Li Incompatible definitions. The definitions are there, but they don't match your compiler. This is particularly often the case with C++ on systems that don't have a native C++ compiler. The \fIgcc\fR utility program \fIprotoize\fR, .XX "protoize, command" which is run when installing \fIgcc\fR, is supposed to take care of these differences, and it may be of use even if you choose not to install \fIgcc\fR. .Li Incorrect \s10\f(CW#ifdefs\fR\s0. For example, the file may define certain functions only if \s10\f(CW_POSIX_SOURCE\fR\s0 is defined, even though \s10\f(CW_POSIX_SOURCE\fR\s0 is intended to restrict functionality, not to enable it. The System V.4.2 version of \fImath.h\fR .XX "math.h" surrounds \s10\f(CWM_PI\fR\s0 (the constant pi) with .Ps #if (_\/_STDC_\/_ && !defined(_POSIX_SOURCE)) || defined(_XOPEN_SOURCE) .Pe .IP In other words, if you include \fImath.h\fR without defining \s10\f(CW_\/_STDC_\/_\fR\s0 (ANSI C) or \s10\f(CW_XOPEN_SOURCE\fR\s0 (X/Open compliant), \s10\f(CWM_PI\fR\s0 will not be defined. .Li The header files may contain syntax errors that the native compiler does not notice, but which cause other compilers to refuse them. For example, some versions of XENIX \s10\f(CWcurses.h\fR\s0 contain the lines: .Ps #ifdef M_TERMCAP # include /* Use: cc -DM_TERMCAP ... 
-lcurses -ltermlib */ #else # ifdef M_TERMINFO # include /* Use: cc -DM_TERMINFO ... -ltinfo [-lx] */ # else ERROR -- Either "M_TERMCAP" or "M_TERMINFO" must be #define'd. # endif #endif .Pe .IP This does not cause problems for the XENIX C compiler, but \fIgcc\fR, for one, complains about the unterminated character constant starting with \s10\f(CWdefine'd\fR\s0. .Li The header files may be "missing". In the course of time, header files have come and gone, and the definitions have been moved to other places. In particular, the definitions that used to be in \fIstrings.h\fR have been moved to \fIstring.h\fR (and changed .XX "string.h, header file" .XX "strings.h, header file" somewhat on the way), and \fItermio.h\fR has become .XX "termio.h, header file" .XX "termios.h, header file" \fItermios.h\fR (see \*[chterm], page \*[termios] for more details). .Le The solutions to these problems are many and varied. They usually leave you feeling dissatisfied: .Ls B .Li Fix the system header files. This sounds like heresy, but if you have established beyond any reasonable doubt that the header file is to blame, this is about all you can do, assuming you can convince your system administrator that it is necessary. If you do choose this way, be sure to consider whether fixing the header file will break some other program that relies on the behaviour. In addition, you should report the bugs to your vendor and remember to re-apply the updates when you install a newer version of the operating system. .Li Use the system header files, but add the missing definitions in local header files, or, worse, in the individual source files. This is a particularly obnoxious "solution", especially when, as so often, the declarations are not dependent on a particular \fIifdef\fR. In almost any system with reasonably complete header files there will be discrepancies between the declarations in the system header files and the declarations in the package. 
Even if they are only cosmetic, they will stop an ANSI compiler from compiling. For example, your system header files may declare \s10\f(CWgetpid\fR\s0 to return \s10\f(CWpid_t\fR\s0, but the package declares it to return \s10\f(CWint\fR\s0. .IP About the only legitimate use of this style of "fixing" is to declare functions that will really cause incorrect compilation if you don't declare them. Even then, declare them only inside an \fIifdef\fR for a specific operating system. In the case of \s10\f(CWgetpid\fR\s0, you're better off not declaring it: the compiler will assume the correct return values. Nevertheless, you will see this surprisingly often in packages that have already been ported to a number of operating systems, and it's one of the most common causes of porting problems. .Li Make your own copies of the header files and use them instead. This is the worst idea of all: if anything changes in your system's header files, you will never find out about it. It also means you can't give your source tree to somebody else: in most systems, the header files are subject to copyright. .Le @ 3.0 log @Final draft @ text @d2 1 a2 1 .\" $Id: headers.ms,v 2.4 1995年06月24日 11:01:11 grog Exp grog $ d4 3 d382 19 a400 6 A large number of system and library calls return information as something like an \s10\f(CWint\fR\s0: some value expressible in up to 32 bits. In the Seventh Edition, it was easy to handle this kind of return value: you didn't need to say anything. That was reasonable as long as UNIX had been ported only to a few platforms, but it is an enemy of portability. Consider the following structure definition, which contains information returned by system calls: d420 4 a423 11 structure to a 64 bit machine running System V.4? 
System V.4 major and minor device numbers are now longer, and if you're using \fIufs\fR .XX "ufs" .XX "UNIX file system" or \fIvxfs\fR, .XX "vxfs" .XX "Veritas file system" your \fIinode\fR .XX "inode numbers" numbers will also be more than 16 bits long. If you port this package to 4.4BSD, the field \s10\f(CWlog_file_pos\fR\s0 needs to be 64 bits long. @ 2.5 log @Final draft, second cut @ text @a599 2 .\" XXX Put this (and other stuff) in section on "fixing broken source"? .\" Grog, later: Not this time round @ 2.4 log @Final draft, first cut. @ text @d2 1 a2 1 .\" $Id: headers.ms,v 2.3 1995年06月09日 04:32:57 grog Exp grog $ d4 3 d152 2 a153 2 Well, it \fIdoes\fR look vaguely like C, but this kind of header file scares off most people. A number of conflicts have led to this kind of code: d549 1 a549 1 versions of Xenix \s10\f(CWcurses.h\fR\s0 contain the lines: d562 1 a562 1 This does not cause problems for the Xenix C compiler, but \fIgcc\fR, for one, @ 2.3 log @Remove date from page headers Minor mods @ text @d2 1 a2 1 .\" $Id: headers.ms,v 2.2 1995年06月03日 08:22:09 grog Exp grog $ d4 4 d598 1 @ 2.2 log @Major mods after Andy's final draft review @ text @d2 1 a2 1 .\" $Id: headers.ms,v 2.1 1995年02月04日 17:14:18 grog Exp grog $ d4 3 d40 1 a40 1 .St "Header files ($Date: 1995年02月04日 17:14:18 $)" d266 1 a266 1 .Pn _\/_P d593 1 @ 2.1 log @Minor mods @ text @d2 1 a2 1 .\" $Id: headers.ms,v 2.0 1994年12月17日 17:22:18 grog Exp grog $ d4 3 d37 1 a37 1 .St "Header files ($Date: 1994年12月17日 17:22:18 $)" d41 2 a42 2 and just occasionally to specify that a function did something out of the ordinary like taking a \s10\f(CWdouble\fR\s0 parameter or returning a d44 1 a44 2 that. We'll look at the changes in more detail in \*[chcompiler], page \*[ansi-C]. d46 23 a68 2 Some of these changes have had a far-reaching effect on the structure of system header files. 
In particular: d108 3 a110 3 __BEGIN_DECLS void clearerr __P((FILE *)); int fclose __P((FILE *)); d114 1 a114 1 extern __const char *__const sys_errlist[]; d116 1 a116 1 void perror __P((const char *)); d118 1 a118 1 __END_DECLS d127 2 a128 2 __BEGIN_DECLS char *ctermid __P((char *)); d130 1 a130 1 __END_DECLS d137 2 a138 2 __BEGIN_DECLS char *fgetln __P((FILE *, size_t *)); d140 1 a140 1 __END_DECLS d153 1 a153 2 In the following sections we'll look at the problems caused by the solutions to these conflicts. d157 11 a167 11 Although most current UNIX implementations do not cross every \fIt\fR and dot every \fIi\fR when it comes to conformance with POSIX.1 and ANSI C, every implementation offers a number of features that are not part of either standard. A program that conforms with the standards must not use these features. You can specify that you wish your program to be compliant with the standards by defining the preprocessor variables \s10\f(CW_ANSI_SOURCE\fR\s0 or \s10\f(CW_POSIX_SOURCE\fR\s0. This prevents the inclusion of certain definitions. In our example, the array \s10\f(CWsys_errlist\fR\s0, (see \*[chlib], page \*[sys_errlist]), is not part of POSIX.1 or ANSI, so the definition is not included if either preprocessor variable is set. If we refer to \s10\f(CWsys_errlist\fR\s0 anyway, the compiler signifies an error, since the d169 3 a171 2 POSIX.1 but not in ANSI C, so it needs to be defined in a different place under different conditions. d189 1 a189 1 according to a predetermined code. To illustrate this, let's look at a simple d198 1 a198 1 \s10\f(CW_sense__\fR\s0. d202 1 a202 1 \s10\f(CW_sense__8Internal\fR\s0. d210 1 a210 1 the second parameter: \s10\f(CW_sense__8InternaliFUc\fR\s0. d215 1 a215 1 \s10\f(CW_sense__8InternaliFUcR8Internal\fR\s0. d218 1 a218 1 \s10\f(CW_sense__8InternaliFUcR8Internale\fR\s0. d220 2 a221 1 For more details on function name mangling, see [Ellis&Stroustup 90]. 
d227 9 a235 8 This difference in naming would be a problem when a C++ program really needs to call a function written in C. The name in the object file is not mangled, and so the C++ compiler must not output a reference to a mangled name. Theoretically, there could be other differences between C++ calls and C calls that the compiler also needs to take into account. You can't just assume that a function written in another language adheres to the same conventions, so you have to tell it when a called function is written according to C conventions rather than according to C++ conventions. d252 3 a254 3 It would be a pain to have a separate set of header files for each version. The solution that was chosen was to define preprocessor variables to choose one of the three formats based on the flavour of C or C++ that we are using. They are: d257 2 a258 2 \s10\f(CW__BEGIN_DECLS\fR\s0, which is defined as \s10\f(CWextern "C" {\fR\s0 for C++ and nothing otherwise. d260 2 a261 2 \s10\f(CW__END_DECLS\fR\s0, which is defined as \s10\f(CW};\fR\s0 for C++ and nothing otherwise. d263 4 a266 4 .Pn __P \s10\f(CW__P(foo)\fR\s0, which is defined as \s10\f(CWfoo\fR\s0 for C++ and ANSI C, and nothing otherwise. This is the reason why the arguments to \s10\f(CW__P()\fR\s0 are enclosed in double parentheses: the outside level of a268 1 In this implementation, \fIsys/cdefs.h\fR defines these preprocessor a269 2 variables. What happens if \fIsys/cdefs.h\fR isn't included before \fIstdio.h\fR? Lots of error messages. So just about the first line in d271 15 a285 14 \fIstdio.h\fR is \s10\f(CW#include \fR\s0. 
This is not the only place that \fIsys/cdefs.h\fR is included: in this particular implementation, it is included from \fIassert.h\fR, \fIdb.h\fR, \fIdirent.h\fR, \fIerr.h\fR, \fIfnmatch.h\fR, \fIfstab.h\fR, \fIfts.h\fR, \fIglob.h\fR, \fIgrp.h\fR, \fIkvm.h\fR, \fIlocale.h\fR, \fImath.h\fR, \fInetdb.h\fR, \fInlist.h\fR, \fIpwd.h\fR, \fIregex.h\fR, \fIregexp.h\fR, \fIresolv.h\fR, \fIrunetype.h\fR, \fIsetjmp.h\fR, \fIsignal.h\fR, \fIstdio.h\fR, \fIstdlib.h\fR, \fIstring.h\fR, \fItime.h\fR, \fIttyent.h\fR, \fIunistd.h\fR, \fIutime.h\fR and \fIvis.h\fR. This places an additional load on the compiler, which reads in a 100 line definition file multiple times. It also creates the possibility for compiler errors. \fIsys/cdefs.h\fR .XX "sys/cdefs.h" defines a preprocessor variable \s10\f(CW_CDEFS_H_\fR\s0 in order to avoid this problem: after the obligatory UCB copyright notice, it d291 3 a293 3 #if defined(__cplusplus) #define __BEGIN_DECLS extern "C" { #define __END_DECLS }; d295 2 a296 2 #define __BEGIN_DECLS #define __END_DECLS d299 4 a302 3 In other words, the preprocessor only processes the body of the header file the first time. After that, the preprocessor variable \s10\f(CW_CDEFS_H_\fR\s0 is defined, and the body will not be processed again. a306 17 This technique of defining preprocessor variables and testing them comes from ANSI C. The ANSI standard defines numerous standard preprocessor variables to ensure that header files are interpreted only the first time they are included. The variables all start with a leading \s10\f(CW_\fR\s0, and the second character is either another \s10\f(CW_\fR\s0 or an upper-case letter. It's a good idea to avoid using such symbols in your sources. .Li We could save including \fIsys/cdefs.h\fR .XX "sys/cdefs.h" multiple times by checking \s10\f(CW_CDEFS_H_\fR\s0 before including it. 
Unfortunately, this would establish an undesirable relationship between the two files: if for some reason it becomes necessary to change the name of the preprocessor variable, or perhaps to give it different semantics (like giving it different values at different times, instead of just being defined), you have to go through all the header files that refer to the preprocessor variable and modify them. .Li d318 19 d374 1 a374 1 definition: d390 6 a395 6 \s10\f(CWshort\fR\s0s take up 16 bits. That's why we put the \s10\f(CWshort\fR\s0s at the end (see \*[chhard], page \*[wordalign] for more details). And for older systems, these fields are perfectly adequate. But what happens if we port a program containing this structure to a 64-bit machine running System V.4? System V.4 major and minor device numbers are now longer, and if you're using \fIufs\fR d428 5 a432 4 the system handles them as integers of appropriate length. Still, if you define them in this manner, you give the compiler an opportunity to catch mistakes and generate more reliable code. \*[apptypes] contains an overview of the more common types used in UNIX systems. a455 2 This simplified example refers to the fact that System V keeps the header file \fIerr.h\fR d457 7 a463 6 in \fI/usr/include/sys\fR, whereas other flavours keep it in \fI/usr/include\fR. In order to include the file correctly, the source code needs to know what kind of system it is running on. If it guesses wrong (for example, if \s10\f(CWUSG\fR\s0 is not defined when it should be) or if the author of the package didn't cater for System V, either because he didn't know or because the package has never been compiled on System V before, then the d505 3 a507 4 So what are the correct header files to include? The answer is: it depends on your system. It's fair to say that no system is supplied with completely correct system header files.
Your system header files will probably suffer from at least one of the following problems: d523 2 a524 2 Incorrect \s10\f(CW#ifdefs\fR\s0. For example, POSIX.1 functions may be defined only if \s10\f(CW_POSIX_SOURCE\fR\s0 is defined, even though d530 1 a530 1 #if (__STDC__ && !defined(_POSIX_SOURCE)) || defined(_XOPEN_SOURCE) d534 1 a534 1 \s10\f(CW__STDC__\fR\s0 (ANSI C) or \s10\f(CW_XOPEN_SOURCE\fR\s0 (X Open d575 4 a578 3 fixing the header file will not break some other program that relies on the behaviour. In addition, you should remember to re-apply the updates when you install a newer version of the operating system. d590 5 a594 5 About the only legitimate use of this style of "fixing" is for functions for which the code will really compile incorrectly if you don't declare them, and then only inside an \fIifdef\fR for a specific operating system. In the case of \s10\f(CWgetpid\fR\s0, you're better off just not declaring it: the compiler will assume the correct return values. Nevertheless, you will see this d599 3 a601 23 worst idea of all of them: if anything changes in your system's header files, you will never find out about it. It also means you can't give your source tree to somebody else: in most systems, the header files are subject to copyright. .Le .Ah "Summary" Header files seem a relatively simple idea, but in fact they can be a major source of annoyance in porting. In particular: .Ls B .Li ANSI and POSIX.1 have added a certain structure to the usage of header files, but there are still many old-fashioned headers out there. .Li ANSI and POSIX.1 have also placed more stringent requirements on data types used in header files. This can cause conflicts with older systems, especially if the author has committed the sin of trying to out-guess the header files. .Li C++ has special requirements of header files. If your header files don't fulfil these requirements, the GNU \fIprotoize\fR program can usually fix them.
.Li There is still no complete agreement on the names of header files, or in which directories they should be placed. In particular, System V.3 and System V.4 frequently disagree as to whether a header file should be in \fI/usr/include\fR or in \fI/usr/include/sys\fR. @ 2.0 log @checked in with -k by grog at 1995/01/09 13:22:41 @ text @d2 1 a2 1 .\" $Id: headers.ms,v 1.11 1994/12/17 17:22:18 grog Exp grog $ d448 1 d527 3 a529 2 \fIgcc\fR, for one, complains about the unterminated character constant starting with \s10\f(CWdefine'd\fR\s0. @
