zkh2016/sgemm

sgemm

The implementation follows the method used in maxas.

performance

Test environment: Ubuntu 18.04, CUDA 10, GeForce GTX 1080 Ti.
The code supports only a limited set of input matrix sizes. It is not a general-purpose implementation and is intended for learning only. The measured throughput in GFLOP/s for different matrix sizes N is:

N	cuBLAS (GFLOP/s)	sgemm (GFLOP/s)	sgemm / cuBLAS
512	4451.6069	3587.3280	80%
1024	7856.5241	6640.6945	84%
2048	9409.4447	8769.9500	93%
4096	10180.4288	9708.4873	95%
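
As a rough illustration of how figures like these are usually derived, the sketch below converts a kernel time into GFLOP/s and computes the sgemm-to-cuBLAS ratio. It assumes the common convention of counting 2·N³ floating-point operations for an N×N×N SGEMM; the repository's own benchmark may count differently. The helper names `gemm_gflops` and `relative_to_cublas` are hypothetical and do not come from `sgemm.cu`. The code is host-only C++, so it also compiles unchanged with `nvcc`.

```cpp
#include <cassert>

// Assumption: an N x N x N SGEMM performs 2*N^3 floating-point operations
// (one multiply and one add per inner-product term).
// Hypothetical helper, not taken from sgemm.cu.
double gemm_gflops(long long n, double elapsed_ms) {
    assert(elapsed_ms > 0.0);  // a zero or negative time is a measurement error
    const double flops = 2.0 * static_cast<double>(n) * n * n;
    const double seconds = elapsed_ms * 1e-3;
    return flops / seconds / 1e9;  // operations per second, scaled to GFLOP/s
}

// Throughput of the hand-written kernel as a fraction of cuBLAS throughput.
// Hypothetical helper, not taken from sgemm.cu.
double relative_to_cublas(double sgemm_gflops, double cublas_gflops) {
    assert(cublas_gflops > 0.0);
    return sgemm_gflops / cublas_gflops;
}
```

For the N = 4096 row, `relative_to_cublas(9708.4873, 10180.4288)` gives roughly 0.954, which the table lists as 95%.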

About

CUDA-based matrix multiplication, benchmarked against cuBLAS. See https://github.com/NervanaSystems/maxas/wiki/SGEMM

Languages

Cuda 99.0%
Makefile 1.0%