A Python API for Llama.cpp, allowing you to fetch routes from devices using Tailscale

MathieuDubart/Llama-cpp-API

Llama.cpp API

Prerequisites

  • Set up Tailscale on every device that will host or call the API
  • Clone G. Gerganov's Llama.cpp from his GitHub repository
  • Download a .gguf model (mistral-7b-instruct-v0.2.Q4_K_M.gguf is recommended)
  • Place the model file inside ./models
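
Before moving on, it can help to confirm that the model file really sits where the last step expects it. The snippet below is a minimal sketch, not part of this repository: the helper name find_gguf_models is hypothetical, and it assumes it is run from the directory that contains ./models.

```python
from pathlib import Path


def find_gguf_models(models_dir: str = "models") -> list[Path]:
    """Return every .gguf file directly inside models_dir, sorted by name.

    Hypothetical helper for illustration only; the repository does not ship it.
    """
    directory = Path(models_dir)
    if not directory.is_dir():
        # A missing folder simply means there is nothing to report yet.
        return []
    return sorted(p for p in directory.glob("*.gguf") if p.is_file())


if __name__ == "__main__":
    found = find_gguf_models()
    if found:
        for model in found:
            print(f"Found model: {model}")
    else:
        print("No .gguf file found in ./models")
```

The file name it prints is the one the model path in server.py needs to match.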

Starting the project

  • Clone this repository inside your Llama.cpp clone: git clone git@github.com:MathieuDubart/Llama-cpp-api.git
  • Open server.py and change the model path to match your model's file name
  • Run python3 server.py in the root directory
  • You can now access the API routes from any device linked to your Tailscale network, using the hosting device's Tailscale IP on port 5000 (e.g. 100.x.x.x:5000/generate)
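
Once the server is running, a client on another Tailscale device could call the route roughly as sketched below. Only the port (5000) and the /generate route come from this README. The HTTP method, the JSON body with a "prompt" field, and the helper names build_url and generate are assumptions made for illustration, so check server.py for the request format it actually expects.

```python
import json
import urllib.request


def build_url(tailscale_ip: str, route: str = "generate", port: int = 5000) -> str:
    """Build the URL for a route served by the hosting device."""
    return f"http://{tailscale_ip}:{port}/{route.lstrip('/')}"


def generate(tailscale_ip: str, prompt: str, timeout: float = 120.0) -> str:
    """Send a prompt to /generate and return the raw response body.

    ASSUMPTION: the route accepts a POST with a JSON body of the form
    {"prompt": "..."}; this README does not document the request format.
    """
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    request = urllib.request.Request(
        build_url(tailscale_ip),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

To try it, call generate("100.x.x.x", "Hello") with 100.x.x.x replaced by the hosting device's Tailscale IP. The generous default timeout is a guess that allows for slow generation on modest hardware.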
