head	3.1;
access;
symbols;
locks; strict;
comment	@.\" @;
3.1
date	95.09.03.15.39.33;	author grog;	state Exp;
branches;
next	3.0;
3.0
date	95.06.25.14.41.01;	author grog;	state Exp;
branches;
next	2.5;
2.5
date	95.06.25.11.59.58;	author grog;	state Exp;
branches;
next	2.4;
2.4
date	95.06.24.11.01.11;	author grog;	state Exp;
branches;
next	2.3;
2.3
date	95.06.09.04.32.57;	author grog;	state Exp;
branches;
next	2.2;
2.2
date	95.06.03.08.22.09;	author grog;	state Exp;
branches;
next	2.1;
2.1
date	95.02.04.17.14.18;	author grog;	state Exp;
branches;
next	2.0;
2.0
date	94.12.17.17.22.18;	author grog;	state Exp;
branches;
next	;
desc
@@
3.1
log
@Better description of int-related types
@
text
@.\" For emacs, this file is in -*- nroff-fill -*- mode
.\" $Id: headers.ms,v 3.0 1995/06/25 14:41:01 grog Exp grog $
.\" $Log: headers.ms,v $
.\" Revision 3.0 1995/06/25 14:41:01 grog
.\" Final draft
.\"
.\" Revision 2.4 1995/06/24 11:01:11 grog
.\" Final draft, first cut.
.\"
.\" Revision 2.3 1995/06/09 04:32:57 grog
.\" Remove date from page headers
.\" Minor mods
.\"
.\" Revision 2.2 1995/06/03 08:22:09 grog
.\" Major mods after Andy's final draft review
.\"
.\" Revision 2.1 1995/02/04 17:14:18 grog
.\" Minor mods
.\"
.\" Revision 1.11 1994/12/17 17:22:18 grog
.\" Minor mods for bignuts macros
.\"
.\" Revision 1.10 1994/11/30 11:19:48 grog
.\" Minor mods
.\"
.\" Revision 1.8 1994/11/27 14:49:30 grog
.\" Minor review
.\"
.\" Revision 1.7 1994/09/30 17:58:33 grog
.\" Snapshot 30 September 94
.\"
.\" Revision 1.6 1994/09/04 17:30:48 grog
.\" Minor mods
.\"
.\" Revision 1.4 1994/08/09 15:50:48 grog
.\" Minor mods
.\"
.\" Revision 1.3 1994/07/08 16:44:25 grog
.\" This file has been taken up into library.roff, and is not
.\" longer needed.
.\"
.\" Revision 1.2 1994/06/09 12:44:34 grog
.\" Minor updates
.\"
.\" Revision 1.1 1994/05/11 17:20:31 grog
.\" Initial revision
.\"
.so global.ms
.Se \*[nchheaders] "Header files"
.St "Header files"
.XX "header files"
.Pn header-files
When the C language was young, header files were required to define structures
and occasionally to specify that a function did something out of the ordinary
like taking a \s10\f(CWdouble\fR\s0 parameter or returning a
\s10\f(CWfloat\fR\s0 result. Then ANSI C and POSIX came along and changed all
that.
.LP
Header files seem a relatively simple idea, but in fact they can be a major
source of annoyance in porting. In particular:
.Ls B
.Li
ANSI and POSIX.1 have added a certain structure to the usage of header files,
but there are still many old-fashioned headers out there.
.Li
ANSI and POSIX.1 have also placed more stringent requirements on data types used
in header files. This can cause conflicts with older systems, especially if the
author has committed the sin of trying to out-guess the header files.
.Li
C++ has special requirements of header files. If your header files don't fulfil
these requirements, the GNU \fIprotoize\fR program can usually fix them.
.Li
There is still no complete agreement on the names of header files, or in which
directories they should be placed. In particular, System V.3 and System V.4
frequently disagree as to whether a header file should be in \fI/usr/include\fR
or in \fI/usr/include/sys\fR.
.Le
.Ah "ANSI C, POSIX.1, and header files"
ANSI C and POSIX.1 have had a far-reaching effect on the structure of system
header files. We'll look at the changes in the C language in more detail in
\*[chcompiler]. The following points are relevant to the use of header files:
.Ls B
.Li
ANSI C prefers to have an ANSI-style prototype for every function call it
encounters. If it doesn't find one, it can't check the function call semantics
as thoroughly, and it may issue a warning. It's a good idea to enable all such
warnings, but this kind of message makes it difficult to recognize the real
errors hiding behind the warnings. In C++, the rules are even stricter: if you
don't have a prototype, it's an error and your source file doesn't compile.
.Li
To do a complete job of error checking, ANSI C requires the prototype in the
new, embedded form:
.Ps
int foo (char *zot, int glarp);
.Pe
and not
.Ps
int foo (zot, glarp)
char *zot;
int glarp;
.Pe
.IP
Old C compilers don't understand this new kind of prototype.
.Li
Header files usually contain many definitions that are not part of POSIX.1. A
mechanism is needed to disable these definitions if you are compiling a program
intended to be POSIX.1 compatible.\**
.FS
Writing your programs to conform to POSIX.1 may be a good idea if you want them
to run on as many platforms as possible. On the other hand, it may also be a
bad idea: POSIX.1 has very rudimentary facilities in some areas. You may find it
more confining than is good for your program.
.FE
.Le
The result of these requirements is spaghetti header files: you frequently see
things like this excerpt from the header file \fIstdio.h\fR in 4.4BSD:
.XX "stdio.h"
.Ps
/*
 * Functions defined in ANSI C standard.
 */
_\/_BEGIN_DECLS
void clearerr _\/_P((FILE *));
int fclose _\/_P((FILE *));
#if !defined(_ANSI_SOURCE) && !defined(_POSIX_SOURCE)
extern int sys_nerr; /* perror(3) external variables */
extern _\/_const char *_\/_const sys_errlist[];
#endif
void perror _\/_P((const char *));
_\/_END_DECLS
/*
 * Functions defined in POSIX 1003.1.
 */
#ifndef _ANSI_SOURCE
#define L_cuserid 9 /* size for cuserid(); UT_NAMESIZE + 1 */
#define L_ctermid 1024 /* size for ctermid(); PATH_MAX */
_\/_BEGIN_DECLS
char *ctermid _\/_P((char *));
_\/_END_DECLS
#endif /* not ANSI */
/*
 * Routines that are purely local.
 */
#if !defined (_ANSI_SOURCE) && !defined(_POSIX_SOURCE)
_\/_BEGIN_DECLS
char *fgetln _\/_P((FILE *, size_t *));
_\/_END_DECLS
.Pe
Well, it \fIdoes\fR look vaguely like C, but this kind of header file scares
most people off. A number of conflicts have led to this kind of code:
.Ls B
.Li
The ANSI C library and POSIX.1 carefully define a subset of the total available
functionality. If you want to abide strictly by the standards, any extension
must be flagged as an error, even if it would work.
.Li
The C++ language has a different syntax from C, but both languages share a
common set of header files.
.Le
The solutions to these conflicts have caused new problems, which we'll examine
in this chapter.
.Ah "ANSI and POSIX.1 restrictions"
.XX "C language, ANSI restrictions"
.XX "C language, POSIX.1 restrictions"
Most current UNIX implementations do not conform completely with POSIX.1 and
ANSI C, and every implementation offers a number of features that are not part
of either standard. A program that conforms with the standards must not use
these features. You can specify that you wish your program to be compliant with
the standards by defining the preprocessor variables \s10\f(CW_ANSI_SOURCE\fR\s0
or \s10\f(CW_POSIX_SOURCE\fR\s0, which maximizes the portability of the code.
It does this by preventing the inclusion of certain definitions. In our
example, the array \s10\f(CWsys_errlist\fR\s0 (see \*[chlib], page
\*[sys_errlist]) is not part of POSIX.1 or ANSI C, so the definition is not
included if either preprocessor variable is set. If we refer to
\s10\f(CWsys_errlist\fR\s0 anyway, the compiler reports an error, since the
array hasn't been declared. Similarly, \s10\f(CWL_cuserid\fR\s0 is defined in
POSIX.1 but not in ANSI C, so the header file defines it only when
\s10\f(CW_ANSI_SOURCE\fR\s0 is not defined.
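.LP
As a minimal sketch of what this means for a program, the following fragment
defines \s10\f(CW_POSIX_SOURCE\fR\s0 before including any header file and uses
the ANSI C function \s10\f(CWstrerror\fR\s0 instead of
\s10\f(CWsys_errlist\fR\s0. The function name \s10\f(CWerror_text\fR\s0 is
invented for this example, and the exact set of definitions that the
preprocessor variable hides differs from one system to another.
.Ps
```c
#define _POSIX_SOURCE           /* must precede the first #include */
#include <errno.h>
#include <string.h>

/* Return the message text for an errno value without touching sys_errlist. */
const char *error_text (int error_number)
{
  return strerror (error_number);
}
```
.Pe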
.Ah "Declarations for C++"
.XX "C++, function declarations"
.XX "function declarations, C++"
.Pn mangling
C++ has additional requirements of symbol naming: \fIfunction overloading\fR
.XX "C++, function overloading"
.XX "function overloading, C++"
allows different functions to have the same name. Assemblers don't think this
is funny at all, and neither do linkers, so the names need to be changed to be
unique. In addition, the names need to somehow reflect the class to which they
belong, the kind of parameters that the function takes and the kind of value it
returns. This is done by a technique called \fIfunction name encoding\fR,
.XX "C++, function name encoding"
.XX "function name encoding, C++"
usually called \fIfunction name mangling\fR.
.XX "function name mangling, C++"
The parameter and return value type information is appended to the function name
according to a predetermined rule. To illustrate this, let's look at a simple
function declaration:
.Ps
double Internal::sense (int a, unsigned char *text, Internal &p, ...);
.Pe
.Ls B
.Li
First, two underscores are appended to the name of the function. With the
initial underscore we get for the assembler, the name is now
\s10\f(CW_sense_\/_\fR\s0.
.Li
Then the class name, \s10\f(CWInternal\fR\s0, is added. Since the length of the
name needs to be specified, this is put in first:
\s10\f(CW_sense_\/_8Internal\fR\s0.
.Li
Next, the parameters are encoded. Simple types like int and char are
abbreviated to a single character (in this case, \s10\f(CWi\fR\s0 and
\s10\f(CWc\fR\s0). If they have modifiers like \s10\f(CWunsigned\fR\s0, these,
too, are encoded and precede the type information. In this case, we get just
plain \s10\f(CWi\fR\s0 for the int parameter, and \s10\f(CWPUc\fR\s0 (a
\s10\f(CWP\fR\s0ointer to \s10\f(CWU\fR\s0nsigned \s10\f(CWc\fR\s0haracters) for
the second parameter: \s10\f(CW_sense_\/_8InternaliPUc\fR\s0.
.Li
Class or structure references again can't be coded ahead of time, so again the
length of the name and the name itself are used. In this case, we have a
reference, so the letter \s10\f(CWR\fR\s0 is placed in front of the name:
\s10\f(CW_sense_\/_8InternaliPUcR8Internal\fR\s0.
.Li
Finally, the ellipsis is specified with the letter \s10\f(CWe\fR\s0:
\s10\f(CW_sense_\/_8InternaliPUcR8Internale\fR\s0.
.Le
For more details on function name mangling, see \fIThe Annotated C++ Reference
Manual\fR by Margaret Ellis and Bjarne Stroustrup.
.XX "Ellis, Margaret"
.XX "Stroustrup, Bjarne"
.XX "declarations for C++"
.XX "C++, declarations for"
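.LP
The steps above can be retraced in a few lines of C. This is only a sketch of
the rule as described here, not of what any particular compiler does, and the
function names \s10\f(CWappend_name\fR\s0 and \s10\f(CWmangle_sense\fR\s0 are
invented for the example. The caller supplies a buffer of at least 40 bytes.
.Ps
```c
#include <stdio.h>
#include <string.h>

/* Append a name preceded by its length, for example "8Internal". */
static void append_name (char *buf, const char *name)
{
  sprintf (buf + strlen (buf), "%lu%s", (unsigned long) strlen (name), name);
}

/* Build the encoded name of Internal::sense step by step. */
void mangle_sense (char *buf)
{
  strcpy (buf, "_sense__");       /* function name and two underscores */
  append_name (buf, "Internal");  /* class name with its length */
  strcat (buf, "i");              /* int a */
  strcat (buf, "PUc");            /* unsigned char *text */
  strcat (buf, "R");              /* Internal &p: a reference ... */
  append_name (buf, "Internal");  /* ... to class Internal */
  strcat (buf, "e");              /* ellipsis */
}
```
.Pe
Calling \s10\f(CWmangle_sense\fR\s0 leaves the string
\s10\f(CW_sense_\/_8InternaliPUcR8Internale\fR\s0 in the buffer.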
.LP
.Pn extern-C++
This difference in naming is a problem when a C++ program really needs to call a
function written in C. The name in the object file is not mangled, and so the
C++ compiler must not output a reference to a mangled name. Theoretically,
there could be other differences between C++ calls and C calls that the compiler
also needs to take into account. You can't just assume that a function written
in another language adheres to the same conventions, so you have to tell the
compiler when
a called function is written according to C conventions rather than according to
C++ conventions.
.LP
This is done with the following elegant construct:
.Ps
extern "C"
 {
 char *ctermid (char *);
 };
.Pe
In ANSI C, the same declaration would be
.Ps
 char *ctermid (char *);
.Pe
and in K&R C it would be
.Ps
 char *ctermid ();
.Pe
It would be a pain to have a separate set of header files for each version.
Instead, the implementors defined preprocessor variables which evaluate to
language constructs for certain places:
.Ls B
.Li
\s10\f(CW_\/_BEGIN_DECLS\fR\s0 is defined as \s10\f(CWextern "C" {\fR\s0 for C++
and nothing otherwise.
.Li
\s10\f(CW_\/_END_DECLS\fR\s0 is defined as \s10\f(CW};\fR\s0 for C++ and nothing
otherwise.
.Li
.Pn __P
\s10\f(CW_\/_P(foo)\fR\s0 is defined as \s10\f(CWfoo\fR\s0 for C++ and ANSI C,
and as an empty pair of parentheses, \s10\f(CW()\fR\s0, otherwise. This is the
reason why the arguments to \s10\f(CW_\/_P()\fR\s0 are enclosed in double
parentheses: the inner pair turns the whole parameter list into a single macro
argument, and the outer pair gets stripped by the preprocessor.
.Le
.XX "sys/cdefs.h"
.XX "stdio.h"
In this implementation, \fIsys/cdefs.h\fR defines these preprocessor variables.
What happens if \fIsys/cdefs.h\fR isn't included before \fIstdio.h\fR? Lots of
error messages. So one of the first lines in \fIstdio.h\fR is \s10\f(CW#include
<sys/cdefs.h>\fR\s0. This is not the only place that \fIsys/cdefs.h\fR is
included: in this particular implementation, from 4.4BSD, it is included from
\fIassert.h\fR, \fIdb.h\fR, \fIdirent.h\fR, \fIerr.h\fR, \fIfnmatch.h\fR,
\fIfstab.h\fR, \fIfts.h\fR, \fIglob.h\fR, \fIgrp.h\fR, \fIkvm.h\fR,
\fIlocale.h\fR, \fImath.h\fR, \fInetdb.h\fR, \fInlist.h\fR, \fIpwd.h\fR,
\fIregex.h\fR, \fIregexp.h\fR, \fIresolv.h\fR, \fIrunetype.h\fR, \fIsetjmp.h\fR,
\fIsignal.h\fR, \fIstdio.h\fR, \fIstdlib.h\fR, \fIstring.h\fR, \fItime.h\fR,
\fIttyent.h\fR, \fIunistd.h\fR, \fIutime.h\fR and \fIvis.h\fR. This places an
additional load on the compiler, which reads in a 100-line definition file
multiple times. It also creates the possibility of compiler errors if
definitions are processed more than once.
\fIsys/cdefs.h\fR defines a preprocessor variable \s10\f(CW_CDEFS_H_\fR\s0 in
order to avoid this problem: after the obligatory UCB copyright notice, it
starts with
.Ps
#ifndef _CDEFS_H_
#define _CDEFS_H_
#if defined(_\/_cplusplus)
#define _\/_BEGIN_DECLS extern "C" {
#define _\/_END_DECLS };
#else
#define _\/_BEGIN_DECLS
#define _\/_END_DECLS
#endif
.Pe
This is a common technique introduced by ANSI C: the preprocessor only processes
the body of the header file the first time. After that, the preprocessor
variable \s10\f(CW_CDEFS_H_\fR\s0 is defined, and the body will not be processed
again.
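.LP
To see the effect in a single file, the sketch below writes out the text of a
hypothetical header file twice, as the compiler would see it if the file were
included twice. All the names are invented for the example. Without the
preprocessor variable \s10\f(CWEXAMPLE_POINT_H\fR\s0, the second definition of
\s10\f(CWstruct point\fR\s0 would be an error.
.Ps
```c
/* First inclusion: EXAMPLE_POINT_H is not yet defined, so the body is processed. */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H
struct point
{
  int x;
  int y;
};
#endif

/* Second inclusion: the variable is now defined, so the body is skipped. */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H
struct point
{
  int x;
  int y;
};
#endif

/* The structure has been defined exactly once and can be used normally. */
int point_sum (struct point p)
{
  return p.x + p.y;
}
```
.Pe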
.LP
There are a couple of things to note about this method:
.Ls B
.Li
There are no hard and fast rules about the naming and definition of these
auxiliary variables. The result is that not all header files use this
technique. For example, in FreeBSD 1.1, the header file \fImachine/limits.h\fR
defines a preprocessor variable
.XX "machine/limits.h"
\s10\f(CW_MACHINE_LIMITS_H\fR\s0 and only interprets the body of the file if
this preprocessor variable was not set on entry. BSD/OS 1.1, on the other
hand, does not. The same header file is present, and the text is almost
identical, but there is nothing to stop you from including and interpreting
\fImachine/limits.h\fR multiple times. The result can be that a package that
compiles just fine under FreeBSD may fail to compile under BSD/OS.
.\" XXX Andy doesn't want these two bullets because they're not strictly needed
.\" for porting. I disagree: once ports start going sour and you start hacking
.\" around, it's nice to know what you're doing.
.Li
The ANSI standard reserves a whole class of names for the implementation: those
that start with a leading \s10\f(CW_\fR\s0 and whose second character is either
another \s10\f(CW_\fR\s0 or an upper-case letter. System header files usually
take the preprocessor variables that ensure they are interpreted only the first
time they are included from this class. It's a good idea to avoid using such
symbols in your sources.
.Li
We could save including \fIsys/cdefs.h\fR
.XX "sys/cdefs.h"
multiple times by checking \s10\f(CW_CDEFS_H_\fR\s0 before including it.
Unfortunately, this would establish an undesirable relationship between the two
files: if for some reason it becomes necessary to change the name of the
preprocessor variable, or perhaps to give it different semantics (like giving it
different values at different times, instead of just being defined), you have to
go through all the header files that refer to the preprocessor variable and
modify them.
.Le
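.LP
The macros described in this section can be imitated in a few lines. The
following is a sketch only: the names \s10\f(CWPORT_BEGIN_DECLS\fR\s0,
\s10\f(CWPORT_END_DECLS\fR\s0, \s10\f(CWPORT_P\fR\s0 and
\s10\f(CWport_add\fR\s0 are hypothetical and are not taken from any system
header file. It shows how a single declaration can serve C++, ANSI C and K&R C.
The function definition itself uses ANSI syntax, so only the declaration is
portable to all three.
.Ps
```c
/* Hypothetical equivalents of the macros described above. */
#if defined(__cplusplus)
#define PORT_BEGIN_DECLS extern "C" {
#define PORT_END_DECLS }
#else
#define PORT_BEGIN_DECLS
#define PORT_END_DECLS
#endif

#if defined(__STDC__) || defined(__cplusplus)
#define PORT_P(protos) protos   /* ANSI C and C++: keep the prototype */
#else
#define PORT_P(protos) ()       /* K&R C: drop the parameter list */
#endif

/* The double parentheses make the parameter list a single macro argument. */
PORT_BEGIN_DECLS
int port_add PORT_P((int, int));
PORT_END_DECLS

int port_add (int a, int b)
{
  return a + b;
}
```
.Pe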
.Ah "ANSI header files"
.XX "header files, ANSI"
.XX "ANSI header files"
.Pn ansi-headers
The ANSI C language definition, also called \fIStandard C\fR, 
.XX "standard C"
was the first to attempt some kind of standardization of header files. As far
as it goes, it works well, but unfortunately it covers only a comparatively
small number of header files. In ANSI C,
.Ls B
.Li
The only header files you should need to include are \fIassert.h\fR,
\fIctype.h\fR, \fIerrno.h\fR, \fIfloat.h\fR, \fIlimits.h\fR, \fIlocale.h\fR,
\fImath.h\fR, \fIsetjmp.h\fR, \fIsignal.h\fR, \fIstdarg.h\fR, \fIstddef.h\fR,
\fIstdio.h\fR, \fIstdlib.h\fR, \fIstring.h\fR and \fItime.h\fR.
.Li
You may include headers in any order.
.Li
You may include any header more than once.
.Li
Header files do not depend on other header files.
.Li
Header files do not include other header files.
.Le
If you can get by with just the ANSI header files, you won't have much trouble.
Unfortunately, real-life programs usually require headers that aren't covered by
the ANSI standard.
.Bh "Type information"
.XX "C language, type information"
.XX "type information, C language"
.Pn types.h
A large number of system and library calls return information which can be
represented in a single machine word. The machine word of the PDP-11, on which
the Seventh Edition ran, was only 16 bits wide, and in some cases you had to
squeeze the value to get it in the word. For example, the Seventh Edition file
system represented an inode number in an \s10\f(CWint\fR\s0, so each file system
could have only 65536 inodes. When 32-bit machines were introduced, people
quickly took the opportunity to extend the length of these fields, and modern
file systems such as \fIufs\fR or \fIvxfs\fR have 32-bit inode numbers.
.XX "ufs"
.XX "UNIX file system"
.XX "vxfs"
.XX "Veritas file system"
.XX "inode numbers"
.LP
These changes were an advantage, but they bore a danger with them: nowadays, you
can't be sure how long an inode number is. Current systems really do have
different sized fields for inode numbers, and this presents a portability
problem. Inodes aren't the only thing that has changed: consider the following
structure definition, which contains information returned by system calls:
.Ps
struct process_info
 {
 long pid; /* process number */
 long start_time; /* time process was started, from time () */
 long owner; /* user ID of owner */
 long log_file; /* file number of log file */
 long log_file_pos; /* current position in log file */
 short file_permissions; /* default umask */
 short log_file_major; /* major device number for log file */
 short log_file_minor; /* minor device number */
 short inode; /* inode number of log file */
 };
.Pe
On most modern systems, the \s10\f(CWlong\fR\s0s take up 32 bits and the
\s10\f(CWshort\fR\s0s take up 16 bits. Because of alignment constraints, we put
the longest data types at the front and the shortest at the end (see \*[chhard],
page \*[wordalign] for more details). And for older systems, these fields are
perfectly adequate. But what happens if we port a program containing this
structure to a 64-bit machine running System V.4 and \fIvxfs\fR? We've already
seen that the inode numbers are now 32 bits long, and System V.4 major and minor
device numbers also take up more space. If you port this package to 4.4BSD, the
field \s10\f(CWlog_file_pos\fR\s0 needs to be 64 bits long.
.LP
Clearly, it's an oversimplification to assume that any particular kind of value
maps to a \s10\f(CWshort\fR\s0 or a \s10\f(CWlong\fR\s0. The correct way to do
this is to define a type that describes the value. In modern C, the structure
above becomes:
.Ps
struct process_info
 {
 pid_t pid; /* process number */
 time_t start_time; /* time process was started, from time () */
 uid_t owner; /* user ID of owner */
 long log_file; /* file number of log file */
 off_t log_file_pos; /* current position in log file */
 mode_t file_permissions; /* default umask */
 short log_file_major; /* major device number for log file */
 short log_file_minor; /* minor device number */
 ino_t inode; /* inode number of log file */
 };
.Pe
It's important to remember that these type definitions are all in the mind of
the compiler, and that they are defined in a header file, which is usually
called \fIsys/types.h\fR:
.XX "sys/types.h"
the system handles them as integers of appropriate length. If you define them
in this manner, you give the compiler an opportunity to catch mistakes and
generate more reliable code. Check your man pages for the types of the
arguments on your system if you run into trouble. In addition, \*[apptypes]
contains an overview of the more common types used in UNIX systems.
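.LP
As a small illustration, the following sketch stores the results of three
system calls in fields of the types that the calls are declared to return. The
structure \s10\f(CWprocess_summary\fR\s0 and the function
\s10\f(CWget_summary\fR\s0 are invented for the example, and it assumes a
system that supplies \fIsys/types.h\fR and \fIunistd.h\fR. Nothing in the code
depends on how long any of the types is.
.Ps
```c
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

struct process_summary
{
  pid_t pid;          /* process number */
  uid_t owner;        /* user ID of owner */
  time_t start_time;  /* time the summary was taken */
};

/* Fill in a summary for the calling process using the system's own types. */
void get_summary (struct process_summary *summary)
{
  summary->pid = getpid ();
  summary->owner = getuid ();
  summary->start_time = time (NULL);
}
```
.Pe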
.Ah "Classes of header files"
.XX "header files, classes"
If you look at the directory hierarchy \fI/usr/include\fR, you may be
.XX "/usr/include"
astounded by the sheer number of header files, over 400 of them on a typical
UNIX system. Fortunately, many of them are in subdirectories, and you usually
won't have to worry about them, except for one subdirectory:
\fI/usr/include/sys\fR.
.Bh "/usr/include/sys"
.XX "/usr/include/sys"
In early versions of UNIX, this directory contained the header files used for
compiling the kernel. Nowadays, this directory is intended to contain header
files that relate to the UNIX implementation, though the usage varies
considerably. You will frequently find files that directly include files from
\fI/usr/include/sys\fR. In fact, it may come as a surprise that this is not
supposed to be necessary. Often you will also see code like
.Ps
#ifdef USG /* System V */
#include <sys/err.h>
#else /* non-System V system */
#include <err.h>
#endif
.Pe
.XX "err.h"
This simplified example shows what you need to do because System V keeps the
header file \fIerr.h\fR in \fI/usr/include/sys\fR, whereas other flavours keep
it in \fI/usr/include\fR. In order to include the file correctly, the source
code needs to know what kind of system it is running on. If it guesses wrong
(for example, if \s10\f(CWUSG\fR\s0 is not defined when it should be) or if the
author of the package didn't allow for System V, either out of ignorance, or
because the package has never been compiled on System V before, then the
compilation will fail with a message about missing header files.
.LP
.Pn wait.h
Frequently, the decisions made by the kind of code in the last example are
incorrect. Some header files in System V have changed between System V.3 and
System V.4. If, for example, you port a program written for System V.4 to System
V.3, you may find things like
.Ps
#include <wait.h>
.Pe
.IP
This will fail in most versions of System V.3, because there is no header file
\fI/usr/include/wait.h\fR; the file is called
.XX "/usr/include/wait.h"
\fI/usr/include/sys/wait.h\fR. There are a couple of things you could do here:
.Ls B
.Li
You could start the compiler with a supplementary
\s10\f(CW-I/usr/include/sys\fR\s0, which will cause it to search
\fI/usr/include/sys\fR for files specified without any pathname component. The
problem with this approach is that you need to do it for every package that runs
into this problem.
.Li
You could consider doing what System V.4 does in many cases: create a file
called \fI/usr/include/wait.h\fR that contains just an obligatory copyright
notice and an \s10\f(CW#include\fR\s0 directive enclosed in
\s10\f(CW#ifdef\fR\s0s:
.Ps
/* THIS IS PUBLISHED NON-PROPRIETARY SOURCE CODE OF O'REILLY */
/* AND ASSOCIATES Inc. */
/* The copyright notice above does not evidence any actual or */
/* intended restriction on the use of this code. */
#ifndef _WAIT_H
#define _WAIT_H
#include <sys/wait.h>
#endif
.Pe
.Le
.Bh "Problems with header files"
.XX "header files, problems with"
.XX "problems, with header files"
It's fair to say that no system is supplied with completely correct system
header files. Your system header files will probably suffer from at least one
of the following problems:
.Ls B
.Li
"Incorrect" naming. The header files contain the definitions you need, but they
are not in the place you would expect.
.Li
Incomplete definitions. Function prototypes or definitions of structures and
constants are missing.
.Li
Incompatible definitions. The definitions are there, but they don't match your
compiler. This is particularly often the case with C++ on systems that don't
have a native C++ compiler. The \fIgcc\fR utility program \fIprotoize\fR,
.XX "protoize, command"
which is run when installing \fIgcc\fR, is supposed to take care of these
differences, and it may be of use even if you choose not to install \fIgcc\fR.
.Li
Incorrect \s10\f(CW#ifdefs\fR\s0. For example, the file may define certain
functions only if \s10\f(CW_POSIX_SOURCE\fR\s0 is defined, even though
\s10\f(CW_POSIX_SOURCE\fR\s0 is intended to restrict functionality, not to
enable it. The System V.4.2 version of \fImath.h\fR
.XX "math.h"
surrounds \s10\f(CWM_PI\fR\s0 (the constant pi) with
.Ps
#if (_\/_STDC_\/_ && !defined(_POSIX_SOURCE)) || defined(_XOPEN_SOURCE)
.Pe
.IP
In other words, if you include \fImath.h\fR without defining
\s10\f(CW_\/_STDC_\/_\fR\s0 (ANSI C) or \s10\f(CW_XOPEN_SOURCE\fR\s0 (X/Open
compliant), \s10\f(CWM_PI\fR\s0 will not be defined.
.Li
The header files may contain syntax errors that the native compiler does not
notice, but which cause other compilers to refuse them. For example, some
versions of XENIX \s10\f(CWcurses.h\fR\s0 contain the lines:
.Ps
#ifdef M_TERMCAP
# include <tcap.h> /* Use: cc -DM_TERMCAP ... -lcurses -ltermlib */
#else
# ifdef M_TERMINFO
# include <tinfo.h> /* Use: cc -DM_TERMINFO ... -ltinfo [-lx] */
# else
 ERROR -- Either "M_TERMCAP" or "M_TERMINFO" must be #define'd.
# endif
#endif
.Pe
.IP
This does not cause problems for the XENIX C compiler, but \fIgcc\fR, for one,
complains about the unterminated character constant starting with
\s10\f(CWdefine'd\fR\s0.
.Li
The header files may be "missing". In the course of time, header files have
come and gone, and the definitions have been moved to other places. In
particular, the definitions that used to be in \fIstrings.h\fR have been moved
to \fIstring.h\fR (and changed
.XX "string.h, header file"
.XX "strings.h, header file"
somewhat on the way), and \fItermio.h\fR has become
.XX "termio.h, header file"
.XX "termios.h, header file"
\fItermios.h\fR (see \*[chterm], page \*[termios] for more details).
.Le
The solutions to these problems are many and varied. They usually leave you
feeling dissatisfied:
.Ls B
.Li
Fix the system header files. This sounds like heresy, but if you have
established beyond any reasonable doubt that the header file is to blame, this
is about all you can do, assuming you can convince your system administrator
that it is necessary. If you do choose this way, be sure to consider whether
fixing the header file will break some other program that relies on the
behaviour. In addition, you should report the bugs to your vendor and remember
to re-apply the updates when you install a newer version of the operating
system.
.Li
Use the system header files, but add the missing definitions in local header
files, or, worse, in the individual source files. This is a particularly
obnoxious "solution", especially when, as so often, the declarations are not
dependent on a particular \fIifdef\fR. In almost any system with reasonably
complete header files there will be discrepancies between the declarations in
the system header files and the declarations in the package. Even if they are
only cosmetic, they will stop an ANSI compiler from compiling. For example,
your system header files may declare \s10\f(CWgetpid\fR\s0 to return
\s10\f(CWpid_t\fR\s0, but the package declares it to return \s10\f(CWint\fR\s0.
.IP
About the only legitimate use of this style of "fixing" is to declare functions
that will really cause incorrect compilation if you don't declare them. Even
then, declare them only inside an \fIifdef\fR for a specific operating system.
In the case of \s10\f(CWgetpid\fR\s0, you're better off not declaring it: the
compiler will assume the correct return values. Nevertheless, you will see this
surprisingly often in packages that have already been ported to a number of
operating systems, and it's one of the most common causes of porting problems.
.Li
Make your own copies of the header files and use them instead. This is the
worst idea of all: if anything changes in your system's header files, you will
never find out about it. It also means you can't give your source tree to
somebody else: in most systems, the header files are subject to copyright.
.Le
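.LP
One variant of the second approach avoids most of these conflicts: supply a
definition only if the system header files have not already done so. The
sketch below does this for \s10\f(CWM_PI\fR\s0 from the \fImath.h\fR example.
The function \s10\f(CWcircle_area\fR\s0 is invented for illustration. This
works for preprocessor constants, since they can be tested with
\s10\f(CW#ifndef\fR\s0. Function declarations can't be tested in this way.
.Ps
```c
#include <math.h>

/* Supply M_PI only if the system header files did not define it. */
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

double circle_area (double radius)
{
  return M_PI * radius * radius;
}
```
.Pe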
@
3.0
log
@Final draft
@
text
@d2 1
a2 1
.\" $Id: headers.ms,v 2.4 1995/06/24 11:01:11 grog Exp grog $
d4 3
d382 19
a400 6
A large number of system and library calls return information as something like
an \s10\f(CWint\fR\s0: some value expressible in up to 32 bits. In the Seventh
Edition, it was easy to handle this kind of return value: you didn't need to say
anything. That was reasonable as long as UNIX had been ported only to a few
platforms, but it is an enemy of portability. Consider the following structure
definition, which contains information returned by system calls:
d420 4
a423 11
structure to a 64 bit machine running System V.4? System V.4 major and minor
device numbers are now longer, and if you're using \fIufs\fR
.XX "ufs"
.XX "UNIX file system"
or \fIvxfs\fR,
.XX "vxfs"
.XX "Veritas file system"
your \fIinode\fR
.XX "inode numbers"
numbers will also be more than 16 bits long. If you port this package to
4.4BSD, the field \s10\f(CWlog_file_pos\fR\s0 needs to be 64 bits long.
@
2.5
log
@Final draft, second cut
@
text
@a599 2
.\" XXX Put this (and other stuff) in section on "fixing broken source"?
.\" Grog, later: Not this time round
@
2.4
log
@Final draft, first cut.
@
text
@d2 1
a2 1
.\" $Id: headers.ms,v 2.3 1995/06/09 04:32:57 grog Exp grog $
d4 3
d152 2
a153 2
Well, it \fIdoes\fR look vaguely like C, but this kind of header file scares off
most people. A number of conflicts have led to this kind of code:
d549 1
a549 1
versions of Xenix \s10\f(CWcurses.h\fR\s0 contain the lines:
d562 1
a562 1
This does not cause problems for the Xenix C compiler, but \fIgcc\fR, for one,
@
2.3
log
@Remove date from page headers
Minor mods
@
text
@d2 1
a2 1
.\" $Id: headers.ms,v 2.2 1995/06/03 08:22:09 grog Exp grog $
d4 4
d598 1
@
2.2
log
@Major mods after Andy's final draft review
@
text
@d2 1
a2 1
.\" $Id: headers.ms,v 2.1 1995/02/04 17:14:18 grog Exp grog $
d4 3
d40 1
a40 1
.St "Header files ($Date: 1995/02/04 17:14:18 $)"
d266 1
a266 1
.Pn _\/_P
d593 1
@
2.1
log
@Minor mods
@
text
@d2 1
a2 1
.\" $Id: headers.ms,v 2.0 1994/12/17 17:22:18 grog Exp grog $
d4 3
d37 1
a37 1
.St "Header files ($Date: 1994/12/17 17:22:18 $)"
d41 2
a42 2
and just occasionally to specify that a function did something out of the
ordinary like taking a \s10\f(CWdouble\fR\s0 parameter or returning a
d44 1
a44 2
that. We'll look at the changes in more detail in \*[chcompiler], page
\*[ansi-C].
d46 23
a68 2
Some of these changes have had a far-reaching effect on the structure of system
header files. In particular:
d108 3
a110 3
__BEGIN_DECLS
void clearerr __P((FILE *));
int fclose __P((FILE *));
d114 1
a114 1
extern __const char *__const sys_errlist[];
d116 1
a116 1
void perror __P((const char *));
d118 1
a118 1
__END_DECLS
d127 2
a128 2
__BEGIN_DECLS
char *ctermid __P((char *));
d130 1
a130 1
__END_DECLS
d137 2
a138 2
__BEGIN_DECLS
char *fgetln __P((FILE *, size_t *));
d140 1
a140 1
__END_DECLS
d153 1
a153 2
In the following sections we'll look at the problems caused by the solutions to
these conflicts.
d157 11
a167 11
Although most current UNIX implementations do not cross every \fIt\fR and dot
every \fIi\fR when it comes to conformance with POSIX.1 and ANSI C, every
implementation offers a number of features that are not part of either standard.
A program that conforms with the standards must not use these features. You can
specify that you wish your program to be compliant with the standards by
defining the preprocessor variables \s10\f(CW_ANSI_SOURCE\fR\s0 or
\s10\f(CW_POSIX_SOURCE\fR\s0. This prevents the inclusion of certain
definitions. In our example, the array \s10\f(CWsys_errlist\fR\s0, (see
\*[chlib], page \*[sys_errlist]), is not part of POSIX.1 or ANSI, so the
definition is not included if either preprocessor variable is set. If we refer
to \s10\f(CWsys_errlist\fR\s0 anyway, the compiler signifies an error, since the
d169 3
a171 2
POSIX.1 but not in ANSI C, so it needs to be defined in a different place under
different conditions.
d189 1
a189 1
according to a predetermined code. To illustrate this, let's look at a simple
d198 1
a198 1
\s10\f(CW_sense__\fR\s0.
d202 1
a202 1
\s10\f(CW_sense__8Internal\fR\s0.
d210 1
a210 1
the second parameter: \s10\f(CW_sense__8InternaliFUc\fR\s0.
d215 1
a215 1
\s10\f(CW_sense__8InternaliFUcR8Internal\fR\s0.
d218 1
a218 1
\s10\f(CW_sense__8InternaliFUcR8Internale\fR\s0.
d220 2
a221 1
For more details on function name mangling, see [Ellis&Stroustup 90].
d227 9
a235 8
This difference in naming would be a problem when a C++ program really needs to
call a function written in C. The name in the object file is not mangled, and
so the C++ compiler must not output a reference to a mangled name.
Theoretically, there could be other differences between C++ calls and C calls
that the compiler also needs to take into account. You can't just assume that a
function written in another language adheres to the same conventions, so you
have to tell the compiler when a called function is written according to C
conventions rather than according to C++ conventions.
d252 3
a254 3
It would be a pain to have a separate set of header files for each version. The
solution that was chosen was to define preprocessor variables to choose one of
the three formats based on the flavour of C or C++ that we are using. They are:
d257 2
a258 2
\s10\f(CW__BEGIN_DECLS\fR\s0, which is defined as \s10\f(CWextern "C" {\fR\s0
for C++ and nothing otherwise.
d260 2
a261 2
\s10\f(CW__END_DECLS\fR\s0, which is defined as \s10\f(CW};\fR\s0 for C++ and
nothing otherwise.
d263 4
a266 4
.Pn __P
\s10\f(CW__P(foo)\fR\s0, which is defined as \s10\f(CWfoo\fR\s0 for C++ and ANSI
C, and nothing otherwise. This is the reason why the arguments to
\s10\f(CW__P()\fR\s0 are enclosed in double parentheses: the outside level of
a268 1
In this implementation, \fIsys/cdefs.h\fR defines these preprocessor
a269 2
variables. What happens if \fIsys/cdefs.h\fR isn't included before
\fIstdio.h\fR? Lots of error messages. So just about the first line in
d271 15
a285 14
\fIstdio.h\fR is \s10\f(CW#include <sys/cdefs.h>\fR\s0. This is not the only
place that \fIsys/cdefs.h\fR is included: in this particular implementation, it
is included from \fIassert.h\fR, \fIdb.h\fR, \fIdirent.h\fR, \fIerr.h\fR,
\fIfnmatch.h\fR, \fIfstab.h\fR, \fIfts.h\fR, \fIglob.h\fR, \fIgrp.h\fR,
\fIkvm.h\fR, \fIlocale.h\fR, \fImath.h\fR, \fInetdb.h\fR, \fInlist.h\fR,
\fIpwd.h\fR, \fIregex.h\fR, \fIregexp.h\fR, \fIresolv.h\fR, \fIrunetype.h\fR,
\fIsetjmp.h\fR, \fIsignal.h\fR, \fIstdio.h\fR, \fIstdlib.h\fR, \fIstring.h\fR,
\fItime.h\fR, \fIttyent.h\fR, \fIunistd.h\fR, \fIutime.h\fR and \fIvis.h\fR.
This places an additional load on the compiler, which reads in a 100-line
definition file multiple times. It also creates the possibility of compiler
errors. \fIsys/cdefs.h\fR
.XX "sys/cdefs.h"
defines a preprocessor variable \s10\f(CW_CDEFS_H_\fR\s0 in order to
avoid this problem: after the obligatory UCB copyright notice, it
d291 3
a293 3
#if defined(__cplusplus)
#define __BEGIN_DECLS extern "C" {
#define __END_DECLS };
d295 2
a296 2
#define __BEGIN_DECLS
#define __END_DECLS
d299 4
a302 3
In other words, the preprocessor only processes the body of the header file the
first time. After that, the preprocessor variable \s10\f(CW_CDEFS_H_\fR\s0 is
defined, and the body will not be processed again.
a306 17
This technique of defining preprocessor variables and testing them is widely
used in ANSI C header files. The ANSI standard reserves a set of identifiers
for the implementation, which uses them to ensure that header files are
interpreted only the first time they are included. These names all start with
a leading \s10\f(CW_\fR\s0, and the second character is either another
\s10\f(CW_\fR\s0 or an upper-case letter. It's a good idea to avoid using such symbols in your sources.
.Li
We could save including \fIsys/cdefs.h\fR
.XX "sys/cdefs.h"
multiple times by checking \s10\f(CW_CDEFS_H_\fR\s0 before including it.
Unfortunately, this would establish an undesirable relationship between the two
files: if for some reason it becomes necessary to change the name of the
preprocessor variable, or perhaps to give it different semantics (like giving it
different values at different times, instead of just being defined), you have to
go through all the header files that refer to the preprocessor variable and
modify them.
.Li
d318 19
d374 1
a374 1
definition:
d390 6
a395 6
\s10\f(CWshort\fR\s0s take up 16 bits. That's why we put the
\s10\f(CWshort\fR\s0s at the end (see \*[chhard], page \*[wordalign] for more
details). And for older systems, these fields are perfectly adequate. But what
happens if we port a program containing this structure to a 64 bit machine
running System V.4? System V.4 major and minor device numbers are now longer,
and if you're using \fIufs\fR
d428 5
a432 4
the system handles them as integers of appropriate length. Still, if you define
them in this manner, you give the compiler an opportunity to catch mistakes and
generate more reliable code. \*[apptypes], contains an overview of the more
common types used in UNIX systems.
a455 2
This simplified example refers to the fact that System V keeps the header file
\fIerr.h\fR
d457 7
a463 6
in \fI/usr/include/sys\fR, whereas other flavours keep it in
\fI/usr/include\fR. In order to include the file correctly, the source code
needs to know what kind of system it is running on. If it guesses wrong (for
example, if \s10\f(CWUSG\fR\s0 is not defined when it should be) or if the
author of the package didn't cater for System V, either because he didn't know
or because the package has never been compiled on System V before, then the
d505 3
a507 4
So what are the correct header files to include? The answer is: it depends on
your system. It's fair to say that no system is supplied with completely
correct system header files. Your system header files will probably suffer from
at least one of the following problems:
d523 2
a524 2
Incorrect \s10\f(CW#ifdefs\fR\s0. For example, POSIX.1 functions may be defined
only if \s10\f(CW_POSIX_SOURCE\fR\s0 is defined, even though
d530 1
a530 1
#if (__STDC__ && !defined(_POSIX_SOURCE)) || defined(_XOPEN_SOURCE)
d534 1
a534 1
\s10\f(CW__STDC__\fR\s0 (ANSI C) or \s10\f(CW_XOPEN_SOURCE\fR\s0 (X/Open
d575 4
a578 3
fixing the header file will not break some other program that relies on the
behaviour. In addition, you should remember to re-apply the updates when you
install a newer version of the operating system.
d590 5
a594 5
About the only legitimate use of this style of "fixing" is for functions for
which the code will really compile incorrectly if you don't declare them, and
then only inside an \fIifdef\fR for a specific operating system. In the case of
\s10\f(CWgetpid\fR\s0, you're better off just not declaring it: the compiler
will assume the correct return values. Nevertheless, you will see this
d599 3
a601 23
worst idea of all of them: if anything changes in your system's header files,
you will never find out about it. It also means you can't give your source tree
to somebody else: in most systems, the header files are subject to copyright.
.Le
.Ah "Summary"
Header files seem a relatively simple idea, but in fact they can be a major
source of annoyance in porting. In particular:
.Ls B
.Li
ANSI and POSIX.1 have added a certain structure to the usage of header files,
but there are still many old-fashioned headers out there.
.Li
ANSI and POSIX.1 have also placed more stringent requirements on data types used
in header files. This can cause conflicts with older systems, especially if the
author has committed the sin of trying to out-guess the header files.
.Li
C++ has special requirements of header files. If your header files don't fulfil
these requirements, the GNU \fIprotoize\fR program can usually fix them.
.Li
There is still no complete agreement on the names of header files, or in which
directories they should be placed. In particular, System V.3 and System V.4
frequently disagree as to whether a header file should be in \fI/usr/include\fR
or in \fI/usr/include/sys\fR.
@
2.0
log
@checked in with -k by grog at 1995/01/09 13:22:41
@
text
@d2 1
a2 1
.\" $Id: headers.ms,v 1.11 1994/12/17 17:22:18 grog Exp grog $
d448 1
d527 3
a529 2
\fIgcc\fR, for one, complains about the unterminated character constant starting
with \s10\f(CWdefine'd\fR\s0.
@