OpenAI-compatible chat/completion proxy backed by https://chat.z.ai/.

- Supports `POST /v1/chat/completions`
- Supports `POST /v1/responses`
- Creates a fresh upstream chat for every request
- Preserves reasoning output separately from final answer text
- Reuses `ZAI_SESSION_TOKEN` directly or refreshes it from `ZAI_JWT`

Requirements:

- Python 3.12+
- `uv`
- One of:
  - `ZAI_JWT`
  - `ZAI_SESSION_TOKEN`
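
The credential requirement above can be sketched as a small resolver. This is an illustration only, not the project's actual code: `resolve_session_token` and the `refresh` callable are hypothetical names, and the precedence (JWT first, then the direct session token) is an assumption.

```python
from collections.abc import Callable, Mapping


def resolve_session_token(
    env: Mapping[str, str],
    refresh: Callable[[str], str],
) -> str:
    """Return a session token derived from ZAI_JWT or ZAI_SESSION_TOKEN.

    `refresh` is a hypothetical caller-supplied function that exchanges
    a JWT for a fresh session token; the real upstream call is not shown.
    """
    jwt = env.get("ZAI_JWT", "").strip()
    if jwt:
        # Assumed precedence: a JWT wins because it yields a fresh token.
        return refresh(jwt)
    token = env.get("ZAI_SESSION_TOKEN", "").strip()
    if token:
        # No JWT available: reuse the session token as-is.
        return token
    raise RuntimeError("Set ZAI_JWT or ZAI_SESSION_TOKEN")
```

Passing the environment as a mapping keeps the check easy to exercise without touching the real process environment.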

Run as a module:

    export ZAI_JWT='your-jwt'
    uv run python -m zai2api

Or with the installed script:

    export ZAI_JWT='your-jwt'
    uv run zai2api

Default bind address is 0.0.0.0:8000.

Environment variables:

- `ZAI_JWT`: preferred auth source; used to fetch a fresh session token
- `ZAI_SESSION_TOKEN`: optional direct session token reuse
- `DEFAULT_MODEL`: defaults to `glm-5`
- Available public model ids:
  - `glm-5`, `glm-5.1`, `glm-5-turbo` and their `-nothinking` variants
- `HOST`: defaults to `0.0.0.0`
- `PORT`: defaults to `8000`
- `LOG_LEVEL`: defaults to `info`
- `REQUEST_TIMEOUT`: defaults to `120`
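
To make the defaults concrete, here is a minimal sketch of how these variables might be read. `Settings`, `load_settings`, and `PUBLIC_MODELS` are hypothetical names, not the project's own. The sketch also assumes that a `-nothinking` variant is formed by appending the suffix to a base id, and that unknown model ids are rejected at startup.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass

# Public model ids from the list above; the suffix handling is an assumption.
BASE_MODELS = ("glm-5", "glm-5.1", "glm-5-turbo")
PUBLIC_MODELS = frozenset(BASE_MODELS) | {f"{m}-nothinking" for m in BASE_MODELS}


@dataclass(frozen=True)
class Settings:  # hypothetical container, not the project's own type
    default_model: str
    host: str
    port: int
    log_level: str
    request_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    model = env.get("DEFAULT_MODEL", "glm-5")
    if model not in PUBLIC_MODELS:
        # Assumption: reject unknown ids early; the proxy may behave differently.
        raise ValueError(f"unknown model id: {model}")
    return Settings(
        default_model=model,
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        log_level=env.get("LOG_LEVEL", "info"),
        request_timeout=float(env.get("REQUEST_TIMEOUT", "120")),
    )
```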

Example chat completions request:

    curl http://127.0.0.1:8000/v1/chat/completions \
      -H 'content-type: application/json' \
      -d '{
        "model": "glm-5",
        "messages": [
          {"role": "system", "content": "Be concise."},
          {"role": "user", "content": "Say hello."}
        ]
      }'


Example responses request:

    curl http://127.0.0.1:8000/v1/responses \
      -H 'content-type: application/json' \
      -d '{
        "model": "glm-5",
        "input": "Say hello."
      }'
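
The same two requests can be issued from Python using only the standard library. This is a sketch that assumes the proxy is running locally on the default port; `build_request` and `post_json` are hypothetical helper names, and the response shape is not shown because the source does not document it.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumes a local proxy on the default port


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one of the proxy endpoints."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.load(resp)


# Payloads mirror the curl examples above.
chat_payload = {
    "model": "glm-5",
    "messages": [
        {"role": "system", "content": "Be concise."},
        {"role": "user", "content": "Say hello."},
    ],
}
responses_payload = {"model": "glm-5", "input": "Say hello."}
```

With the proxy running, `post_json("/v1/chat/completions", chat_payload)` and `post_json("/v1/responses", responses_payload)` return the decoded JSON bodies.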