Re: Folk Theorem: Assemblers are superior to Compilers
synaptx!thymus!daveg@uunet.UU.NET (Dave Gillespie) (1993-11-15)
[I wrote:]
>How many languages have a declaration that
>tells the compiler that a given pointer, or even a given integer, is a
>multiple of 16?
Ron Guilmette writes:
> In the case of the C language, we are (I think) fortunate to have certain
> "industry standards", which, in many cases, go beyond the requirements
> laid down by the international ISO C standard.
We know about that industry standard, and it has saved our bacon: it would
be incredibly painful for the programmer to arrange for proper alignment
if "new" and "malloc" didn't give that guarantee.
I don't think our compiler guarantees arrays on the stack to be
quadword aligned; the documentation certainly doesn't mention any
such guarantee, and we have never needed to check it out.
> In the case of the i860 (in particular) the ps-ABI for this processor does
> indeed require compilers to align all data objects (and members of struct
> and union types) which have type `long double' to 16 bytes boundaries.
I think you may have missed my point: It's not that we want to load one
quad-float at once, it's that we want to load *four* single-floats at
once. Say you're doing a vector "a = b*c" operation; for every one-cycle
multiply, you need three load/stores. With a bit of loop unrolling plus
load/store-quad, you can get your three load/stores per cycle with room to
spare.
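
As a rough illustration of the unrolling described above, here is a C
sketch. The name "vector_multiply" is made up for this example, the
"restrict" qualifiers assume the three arrays don't overlap, and nothing in
it is i860-specific. It only shows the shape of the loop: each group of four
consecutive floats is what a load/store-quad could move in one instruction
if the pointers were known to be 16-byte aligned, which works out to three
quad operations per four multiplies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a[i] = b[i] * c[i] for i in [0, n), unrolled by four.
   Assumes a, b and c do not overlap (hence "restrict"). */
void vector_multiply(float *restrict a, const float *restrict b,
                     const float *restrict c, size_t n)
{
    size_t i = 0;

    assert(a != NULL && b != NULL && c != NULL);

    /* Main loop: four multiplies per trip. With 16-byte-aligned pointers,
       each group of four floats could be one quad load or store:
       two quad loads (b, c) and one quad store (a) per four multiplies. */
    for (; i + 4 <= n; i += 4) {
        float b0 = b[i], b1 = b[i + 1], b2 = b[i + 2], b3 = b[i + 3];
        float c0 = c[i], c1 = c[i + 1], c2 = c[i + 2], c3 = c[i + 3];

        a[i]     = b0 * c0;
        a[i + 1] = b1 * c1;
        a[i + 2] = b2 * c2;
        a[i + 3] = b3 * c3;
    }

    /* Cleanup loop for the last n % 4 elements. */
    for (; i < n; i++)
        a[i] = b[i] * c[i];
}
```

Whether a compiler may actually emit quad loads for the main loop depends on
it knowing the alignment of "a", "b" and "c", which this code does not tell it.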
This is really an issue of information at the procedure-call boundary.
(In that sense it's a relative of the infamous "noalias" problem.) Say I
have a function
    double sum_vector(double *p, int n);
At first glance, the ABI might imply that "sum_vector" can assume that "p"
is quadword aligned on an 860. But of course it can't; there's nothing
stopping the programmer from writing
    double array[10];
    double last_five = sum_vector(&array[5], 5);
The pointer "p" has the wrong alignment now. And this is nothing specific
to C; even number-friendly FORTRAN has this problem. The only ways around
it are exhaustive interprocedural analysis, non-standard declarations, or
having the compiler automatically write "sum_vector" in the form of
    if (happy(p)) <fast-loop> else <slow-loop>
which is hard to make into a general solution.
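
Written out by hand, the "happy" test might look something like the sketch
below. This is only an illustration under some assumptions: "happy" is a
made-up helper, the cast to uintptr_t assumes pointers convert to integers
in the obvious way (true on common machines, not promised by the C
standard), and doubles are assumed to be 8 bytes. The fast loop here is
ordinary C; it just marks the place where each pair of doubles starts on a
16-byte boundary and could in principle be fetched with one quad load.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical check: nonzero if p is a multiple of 16. */
static int happy(const double *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

double sum_vector(double *p, int n)
{
    double sum = 0.0;
    int i = 0;

    assert(n >= 0);

    if (happy(p)) {
        /* Fast loop: p[i] and p[i+1] share one aligned 16-byte block.
           Two partial sums are kept, so rounding may differ slightly
           from a strict left-to-right sum. */
        double s0 = 0.0, s1 = 0.0;
        for (; i + 2 <= n; i += 2) {
            s0 += p[i];
            s1 += p[i + 1];
        }
        sum = s0 + s1;
    }

    /* Slow loop: the whole vector when p is misaligned, or the odd
       leftover element after the fast loop. */
    for (; i < n; i++)
        sum += p[i];

    return sum;
}
```

With a 16-byte-aligned "array" of ten doubles, sum_vector(array, 10) takes
the fast loop, while sum_vector(&array[5], 5) starts 40 bytes in, which is 8
bytes past a 16-byte boundary, and falls through to the slow loop.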
The compiler we use offers none of these, so the load-quad instruction is
simply out of its reach.
-- Dave