Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
/ simdgen Public

A library to generate a SIMD code for AVX-512/SVE from a given function string.

Notifications You must be signed in to change notification settings

herumi/simdgen

Repository files navigation

simdgen ; a simple code generator for x64 AVX-512/AArch64 SVE (beta version)

Abstract

simdgen evaluates a given string and generates a code to apply it to an array.

Supported OS

  • Windows 10 (x64) + Visual Studio
  • Linux (x64/A64FX) + gcc/clang

How to build a library

On Linux

make test

On Windows

mklib ; make a static library
mk -s test/parser_test.cpp

For aarch64 on x64 Linux

cd src/aarch64/xbyak_aarch64
make CXX=aarch64-linux-gnu-g++ -j
make XBYAK_AARCH64=1 PRE=aarch64-linux-gnu- ARCH=aarch64 bin/accuracy_test.exe -j
env QEMU_LD_PREFIX=/usr/aarch64-linux-gnu qemu-aarch64 -cpu max,sve512=on bin/accuracy_test.exe

How to use

>cat t.c
#include <simdgen/simdgen.h>
#include <stdio.h>
int main(int arg, char *argv[])
{
	const char *src = argc == 1 ? "x+3" : argv[1];
	SgCode *sg = SgCreate();
	SgFuncFloat1 addr = (SgFuncFloat1)SgGetFuncAddr(sg, "x");
	const size_t N = 40;
	float x[N], y[N];
	for (size_t i = 0; i < N; i++) {
		x[i] = i;
	}
	addr(y, x, N);
	for (size_t i = 0; i < N; i++) {
		printf("%5.3f ", y[i]);
		if (((i + 1) % 8) == 0) putchar('\n');
	}
	printf("\n");
	SgDestroy(sg);
}
gcc t.c -I ./src -L ./lib -lsimdgen
./a.out
# compute x+3
3.000 4.000 5.000 6.000 7.000 8.000 9.000 10.000
11.000 12.000 13.000 14.000 15.000 16.000 17.000 18.000
19.000 20.000 21.000 22.000 23.000 24.000 25.000 26.000
27.000 28.000 29.000 30.000 31.000 32.000 33.000 34.000
35.000 36.000 37.000 38.000 39.000 40.000 41.000 42.000
./a.out "exp(log(x+1)+x)"
2.718 5.437 8.155 10.873 13.591 16.310 19.028 21.746
24.465 27.183 29.901 32.619 35.338 38.056 40.774 43.493
46.211 48.929 51.647 54.366 57.084 59.802 62.520 65.239
67.957 70.675 73.394 76.112 78.830 81.548 84.267 86.985
89.703 92.422 95.140 97.858 100.576 103.295 106.013 108.731

API

SGcode* SgCreate()

  • create an instance of Sgcode and return the pointer.
  • return null if fail.

void SgDestroy(SgCreate *sg)

  • destroy an instance of sg.

const void* SgGetFuncAddr(Sgcode *sg, const char *src)

  • sg generates a code to compute a function src.
  • src is a single function of x such as log(exp(x)+1).

Function Type

  • typedef void (*SgFuncFloat1)(float *dst, const float *src, size_t n);
    • cast the address of a function such as log(exp(x)+1) to SgFuncFloat1.
  • typedef float (*SgFuncFloat1Reduce)(const float *src, size_t n);
    • cast the address of a function such as red_sum(x^2) to SgFuncFloat1Reduce.

Support functions

  • arithmetic operations (+, -, \*, /)
  • inv(x)
  • exp(x)
  • log(x)
  • cosh(x)
  • `red_sum(x) ; sum all values and return the value
    • This function can be set on the last function.

Optional envrionment variables

SG_OPT can controll this library.

  • debug=1 ; display some debug information
  • unroll=<num> ; unroll the main loop (num=0, 2, 3...,). It will cause an error if all registers are used.
    • unroll=0 means autodetection of unroll (search max unroll <= 4)
  • dump=<file name> ; save the generated code into the file.
    • objdump -M intel -CSlw -D -b binary -m i386 <file name> shows a disassembled code.
    • Use objdump -m aarch64 -D -b binary for Aarch64.
  • logp1=0 ; disable precise computation of log(x) for x is close to 1.
  • var=<variable name> ; the default value is x.

Examples

env SG_OPT="debug=1 unroll=2 dump=code" bin/accuracy_test.exe

License

modified new BSD License http://opensource.org/licenses/BSD-3-Clause

History

2021/Jul/29 public

Author

MITSUNARI Shigeo(herumi@nifty.com)

Sponsors welcome

GitHub Sponsor

About

A library to generate a SIMD code for AVX-512/SVE from a given function string.

Resources

Stars

Watchers

Forks

Releases

No releases published

Sponsor this project

Packages

No packages published

Languages

AltStyle によって変換されたページ (->オリジナル) /