Building in the LLM serving & agent space: inference, quantization, and a bit of agent RL.
vLLM ecosystem · now heading deeper into infra 🛠️
FP8 / INT8 quantization · efficient inference & serving · multi-agent systems · RLVR for small models
| Project | What it is |
|---|---|
| langextract-vllm | A vLLM provider plugin for LangExtract: run structured extraction on a local vLLM backend |
| claude-code-architecture | Deep reverse-engineering of the Claude Code CLI (v2.1.88) internals from sourcemaps |
| mobileground-r1 | A small-VLM phone-GUI grounding agent, trained with RLVR (GRPO) |
| vantage | AI Job Decision Copilot: scan, score, advise, decide |
Where I contribute upstream:
- Step-Audio2 integration into the vLLM-Omni serving stack
- Stable Audio 3 integration into the vLLM-Omni serving stack
📫 421774554@qq.com