ruml

The goal of this project is to implement a tiny, inference-only library for running ML models. I want it to be something along the lines of ggml and tinygrad.

The idea is to support different optimized backends (a rough sketch of the backend abstraction follows this list):

  • Accelerate
  • AVX
  • OpenBLAS
  • cuBLAS (not sure about this one yet)
  • naive CPU-only (fallback)
  • etc.
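
To make the backend list concrete, here is one way the dispatch could be structured in Rust. This is only a sketch under my own assumptions: the `Backend` trait, the `NaiveCpu` struct, and the op signatures are hypothetical names and do not exist in the repo yet.

```rust
/// Hypothetical backend abstraction: each optimized backend (AVX,
/// OpenBLAS, ...) would implement this trait over raw f32 buffers.
pub trait Backend {
    /// Element-wise add of two equally-sized buffers into `out`.
    fn add(&self, a: &[f32], b: &[f32], out: &mut [f32]);
    /// Row-major matmul: (m x k) * (k x n) -> (m x n).
    fn matmul(&self, a: &[f32], b: &[f32], out: &mut [f32], m: usize, k: usize, n: usize);
}

/// The naive CPU fallback: plain loops, no SIMD, no BLAS.
pub struct NaiveCpu;

impl Backend for NaiveCpu {
    fn add(&self, a: &[f32], b: &[f32], out: &mut [f32]) {
        for ((o, x), y) in out.iter_mut().zip(a).zip(b) {
            *o = x + y;
        }
    }

    fn matmul(&self, a: &[f32], b: &[f32], out: &mut [f32], m: usize, k: usize, n: usize) {
        for i in 0..m {
            for j in 0..n {
                let mut acc = 0.0f32;
                for p in 0..k {
                    acc += a[i * k + p] * b[p * n + j];
                }
                out[i * n + j] = acc;
            }
        }
    }
}
```

An OpenBLAS or AVX backend would then implement the same trait, and the library could pick one at build time via Cargo features or at runtime by probing the CPU.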

The roadmap right now is more or less like this:

  • implement a minimal tensor type with support for broadcasting and dynamic shapes (see the shape sketch after this list)
  • implement a CPU-only backend and write tests for the different ops
  • implement the other backends
  • support fp16, int8, and quantization (a symmetric int8 sketch also follows)
  • build a demo of the library running LLaMA or a similar model
  • would also like this to work on vision models like Segment Anything, ResNet, etc.
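
For the first roadmap item, here is a minimal sketch of what a dynamically-shaped tensor with NumPy-style broadcasting could look like. `Tensor` and `broadcast_shape` are hypothetical names, not code from this repo.

```rust
/// Minimal tensor: flat f32 storage plus a dynamic shape.
#[derive(Debug, Clone)]
pub struct Tensor {
    pub data: Vec<f32>,
    pub shape: Vec<usize>, // dynamic rank, e.g. [2, 3] or [4, 1, 5]
}

/// NumPy-style broadcast of two shapes: align from the trailing
/// dimension; each pair of dims must match or one of them must be 1.
pub fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let rank = a.len().max(b.len());
    let mut out = vec![0; rank];
    for i in 0..rank {
        let da = if i < a.len() { a[a.len() - 1 - i] } else { 1 };
        let db = if i < b.len() { b[b.len() - 1 - i] } else { 1 };
        out[rank - 1 - i] = match (da, db) {
            (x, y) if x == y => x,
            (1, y) => y,
            (x, 1) => x,
            _ => return None, // incompatible shapes
        };
    }
    Some(out)
}
```

For example, `broadcast_shape(&[4, 1, 5], &[3, 1])` yields `Some(vec![4, 3, 5])`: the shapes are aligned from the trailing dimension, and any dimension of 1 stretches to match the other.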
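
And for the quantization item, a sketch of symmetric per-block int8 quantization, the kind of scheme ggml-style libraries commonly use: one f32 scale per block, with values mapped so the largest magnitude lands on 127. Again, these names are hypothetical.

```rust
/// Hypothetical symmetric int8 block: one f32 scale per block of values.
pub struct QuantizedBlock {
    pub scale: f32,
    pub values: Vec<i8>,
}

pub fn quantize_i8(xs: &[f32]) -> QuantizedBlock {
    // The scale maps the largest-magnitude value in the block to 127.
    let max_abs = xs.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let values: Vec<i8> = xs.iter().map(|&x| (x / scale).round() as i8).collect();
    QuantizedBlock { scale, values }
}

pub fn dequantize_i8(q: &QuantizedBlock) -> Vec<f32> {
    // Lossy inverse: recover approximate f32 values from int8 + scale.
    q.values.iter().map(|&v| v as f32 * q.scale).collect()
}
```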
