
Speeding Up GER, GERU, GERC, HER, HER2, SYR and SYR2

All of these routines rely on the GER primitive for their performance. The hand-written primitives tried by ATLAS may be found in ATLAS/tune/blas/ger/CASES.

Most of the discussion of the GEMV primitives applies to the GER primitives as well, so I assume you are familiar with the concepts discussed above. As before, the routines to be timed are listed in a kernel description file, <pre>cases.dsc. Because GER has no transpose case, this file simply gives the number of GER primitives to search, followed by that many primitive lines describing them.

GER primitive lines are of the form:

<ID> <flag> <Xunroll> <Yunroll> <filename> "<author(s)>"
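To make the line format concrete, here is a minimal sketch of reading one primitive line in C. It is not ATLAS's own parser: the struct GerCase, the function parse_ger_case, and the buffer sizes are all hypothetical, and it assumes that <ID>, <flag>, <Xunroll> and <Yunroll> are integers, that the filename contains no whitespace, and that the author string is non-empty.

```c
#include <stdio.h>

/* Hypothetical container for one GER primitive line; not an ATLAS type. */
typedef struct {
    int id;              /* <ID> */
    int flag;            /* <flag>, assumed here to be an integer */
    int xunroll;         /* <Xunroll> */
    int yunroll;         /* <Yunroll> */
    char filename[256];  /* <filename>, assumed to contain no whitespace */
    char author[128];    /* <author(s)>, text between the double quotes */
} GerCase;

/*
 * Parse a line of the form
 *   <ID> <flag> <Xunroll> <Yunroll> <filename> "<author(s)>"
 * Returns 1 if all six fields were read, 0 otherwise.
 */
int parse_ger_case(const char *line, GerCase *out)
{
    int nread = sscanf(line, "%d %d %d %d %255s \"%127[^\"]\"",
                       &out->id, &out->flag, &out->xunroll, &out->yunroll,
                       out->filename, out->author);
    return nread == 6;
}
```

For example, a made-up line such as `1 0 4 2 my_ger_kernel.c "A. Author"` would parse into ID 1, flag 0, Xunroll 4, Yunroll 2, with the filename and author stored as strings.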

The API for the ger primitive is:

#if defined(SCPLX) || defined(DCPLX)
 #ifdef Conj_
 ATL_<pre>ger1c_a1_x1_yX
 #else
 ATL_<pre>ger1u_a1_x1_yX
 #endif
#else
 ATL_<pre>ger1_a1_x1_yX
#endif
 (
 const int M, /* length of X vector */
 const int N, /* length of Y vector */
 const SCALAR alpha,/* ignored, assumed to be one */
 const TYPE *X, /* pointer to X vector */
 const int incX, /* ignored, assumed to be one */
 const TYPE *Y, /* pointer to Y vector */
 const int incY, /* increment of Y vector; NOTE: NOT IGNORED */
 TYPE *A, /* pointer to column-major matrix */
 const int lda /* leading dimension of A, or row-stride */
 );
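As an illustration of what a kernel with this API is expected to compute, here is an unoptimized reference sketch for the double precision real case. The name ref_dger1_a1_x1_yX is hypothetical (it is not an ATLAS symbol), TYPE and SCALAR are both taken to be double, and incY is assumed to be positive. It performs the rank-1 update A += x * y^T on a column-major M x N matrix.

```c
/*
 * Hypothetical reference GER kernel (illustrative name, not an ATLAS symbol).
 * Computes A += X * Y^T, where A is M x N and stored column-major.
 * Assumes TYPE == SCALAR == double and incY > 0.
 */
void ref_dger1_a1_x1_yX(const int M, const int N, const double alpha,
                        const double *X, const int incX,
                        const double *Y, const int incY,
                        double *A, const int lda)
{
    int i, j;

    (void)alpha;  /* ignored, assumed to be one */
    (void)incX;   /* ignored, assumed to be one */

    for (j = 0; j < N; j++) {
        const double yj = *Y;   /* element j of Y */
        for (i = 0; i < M; i++)
            A[i] += X[i] * yj;  /* update column j of A */
        Y += incY;              /* incY is NOT ignored */
        A += lda;               /* advance to the next column */
    }
}
```

A tuned kernel would unroll the loops over X and Y, which is presumably what the <Xunroll> and <Yunroll> fields of the primitive line describe; this sketch only fixes the semantics of the arguments.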

Assumptions:




Clint Whaley 2012-07-10
