GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
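To make the "core distributed runtime" concrete, here is a minimal sketch of Ray's task API; the function and input values are illustrative only, not taken from the Ray docs.

```python
# Minimal sketch of Ray's core task API: run plain Python functions in parallel
# across a cluster (or locally). Assumes `pip install ray`; values are illustrative.
import ray

ray.init()  # connects to an existing cluster, or starts a local one

@ray.remote
def square(x: int) -> int:
    return x * x

# Each .remote() call is scheduled as a distributed task; ray.get() gathers results.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```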
Find secrets with Gitleaks 🔑
This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and real-world LLM application deployment).
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.
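Because the endpoint is OpenAI-compatible, any OpenAI client can talk to it. A minimal sketch follows; the local base URL, port, and model name are assumptions to be adjusted to whatever the server actually exposes.

```python
# Hedged sketch: querying a locally served, OpenAI-compatible endpoint with the
# official `openai` client (pip install openai). The base URL and model name are
# assumptions; point them at your own deployment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/v1", api_key="not-needed-locally")

response = client.chat.completions.create(
    model="deepseek-r1",  # hypothetical model identifier
    messages=[{"role": "user", "content": "In one sentence, what does an inference server do?"}],
)
print(response.choices[0].message.content)
```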
Official inference library for Mistral models
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
High-speed Large Language Model Serving for Local Deployment
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Superduper: End-to-end framework for building custom AI applications and agents.
Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
Open-source implementation of AlphaEvolve
The data plane for agents. Arch is a models-native proxy server that handles the plumbing work in AI: agent routing and handoff, guardrails, zero-code logs and traces, and unified access to LLMs from OpenAI, Anthropic, Ollama, etc. Build agents faster and scale them reliably.
FlashInfer: Kernel Library for LLM Serving
Simple, scalable AI model deployment on GPU clusters
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
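A hedged sketch of how a multi-LoRA server is typically queried: the request names a fine-tuned adapter so the server can apply it over the shared base model. The endpoint path and `adapter_id` parameter follow the pattern of text-generation-style servers and are assumptions here, not the project's documented API.

```python
# Hedged sketch of selecting one of many fine-tuned LoRA adapters per request.
# The URL, route, and `adapter_id` field are assumptions; check the server's docs.
import requests

payload = {
    "inputs": "Classify the sentiment: 'Great battery life.'",
    "parameters": {"adapter_id": "acme/sentiment-lora", "max_new_tokens": 32},
}
resp = requests.post("http://localhost:8080/generate", json=payload, timeout=60)
print(resp.json())
```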