
User supplied cleanup

Users can supply cleanup code for the following three cases only, all of which come in the three BETA variants:
  1. M-cleanup $M < N_B$ && $N = K = N_B$
  2. N-cleanup $N < N_B$ && $M = K = N_B$
  3. K-cleanup $K < N_B$ && $M = N = N_B$

The generated code handles all cleanup where more than one dimension is less than the blocking factor. This simplification allows ATLAS to avoid having to test $N_B^3$ cases when selecting user cleanup. Once the matrices in question are larger than $N_B$, cleanup with more than one dimension less than $N_B$ rapidly stops being a performance factor. Small matrices where this cleanup is a factor are almost certainly going to be handled by ATLAS's small-case code anyway, so it seems unlikely that this simplification will hurt performance in practice. Section 2.7.5 shows this in a more formal way.
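The split between user-supplied and generated cleanup can be sketched as a small classifier. This is only an illustration of the rule described above, not ATLAS's actual selection code: the names `cleanup_case` and `classify_cleanup` are hypothetical, and the sketch assumes every dimension satisfies $0 < \mathrm{dim} \le N_B$.

```c
/* Hypothetical labels for this sketch; these are not ATLAS identifiers. */
typedef enum {
    CLEAN_NONE,      /* M = N = K = NB: a full block, no cleanup needed  */
    CLEAN_M,         /* M < NB, N = K = NB: user M-cleanup may be used   */
    CLEAN_N,         /* N < NB, M = K = NB: user N-cleanup may be used   */
    CLEAN_K,         /* K < NB, M = N = NB: user K-cleanup may be used   */
    CLEAN_GENERATED  /* two or more dimensions < NB: generated code only */
} cleanup_case;

/*
 * Classify a block of size m x n x k against blocking factor nb.
 * Assumes 0 < m, n, k <= nb (an assumption of this sketch).
 */
cleanup_case classify_cleanup(int m, int n, int k, int nb)
{
    int short_m = (m < nb);
    int short_n = (n < nb);
    int short_k = (k < nb);
    int nshort  = short_m + short_n + short_k;

    if (nshort == 0) return CLEAN_NONE;
    if (nshort > 1)  return CLEAN_GENERATED;
    if (short_m)     return CLEAN_M;
    if (short_n)     return CLEAN_N;
    return CLEAN_K;
}
```

Because exactly one dimension is short in each user-cleanup case, each case has fewer than $N_B$ possible shapes, which is why selection over user cleanup stays far below $N_B^3$ candidates.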

Users need to be very careful when supplying cleanup. If the user indicates that a dimension must be a compile-time variable rather than a run-time variable, ATLAS will generate up to $N_B$ routines to handle user cleanup. Since user routines are compiled with all BETA variants, it is possible to generate $9 N_B$ cleanup cases, in addition to ATLAS's generated cases. It is therefore recommended that the user supply cleanup that uses run-time arguments whenever possible, and indicate that kernels taking compile-time dimensions are not to be used for cleanup.
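The $9 N_B$ figure follows from three cleanup cases, three BETA variants, and up to $N_B$ routines per compile-time case. The sketch below works through that arithmetic. The function name `user_cleanup_routines` is hypothetical, and the sketch assumes that a cleanup case taking its short dimension at run time needs only one routine per BETA variant; that assumption is ours and is not stated in the text.

```c
/* Each user routine is compiled with all three BETA variants (per the text). */
enum { BETA_VARIANTS = 3, CLEANUP_CASES = 3 };

/*
 * Worst-case count of user cleanup routines for blocking factor nb.
 * ct_cases: how many of the three cleanup cases (M, N, K) require the
 *           short dimension at compile time (0..3); each such case can
 *           expand to up to nb routines.
 * The remaining cases are assumed (in this sketch) to take the dimension
 * at run time and so to need a single routine each.
 */
int user_cleanup_routines(int nb, int ct_cases)
{
    int rt_cases = CLEANUP_CASES - ct_cases;
    return BETA_VARIANTS * (ct_cases * nb + rt_cases);
}
```

For example, with $N_B = 40$, `user_cleanup_routines(40, 3)` gives 360 routines ($9 N_B$), while `user_cleanup_routines(40, 0)` gives 9 under the run-time assumption above.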


Clint Whaley 2012-07-10
