Releases: crossps/llm-bridge-cache

v1.2.0 - GLM (Z.ai) provider

31 May 18:35
@crossps crossps

Adds GLM (Zhipu / Z.ai) as a built-in provider.

  • GLM models (glm-4.6, glm-4.5, glm-4.5-flash, ...) route to https://api.z.ai/api/paas/v4 — Zhipu's OpenAI-compatible endpoint — via the existing passthrough path.
  • Set GLM_API_KEY and use a glm-* model. Override baseUrl for the coding-plan (/api/coding/paas/v4) or mainland (open.bigmodel.cn) endpoints.
  • Works across all three inbound formats; cross-converts for /v1/messages.
  • 18 tests, all passing.
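
The prefix-based routing described above might look roughly like the following. This is a minimal sketch, not the project's actual code: `resolveGlmRoute`, `GlmRoute`, and the override parameter are hypothetical names chosen for illustration. Only the `glm-` prefix, the default endpoint, and the `GLM_API_KEY` variable come from the release notes.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the bridge's API.
interface GlmRoute {
  provider: "glm";
  baseUrl: string;   // OpenAI-compatible endpoint the request is passed through to
  apiKeyEnv: string; // environment variable that holds the key
}

// Default endpoint, as stated in the release notes.
const GLM_DEFAULT_BASE_URL = "https://api.z.ai/api/paas/v4";

// Returns a route for glm-* models, or null so the caller can try other providers.
// `baseUrlOverride` stands in for whatever baseUrl setting the user has configured.
function resolveGlmRoute(model: string, baseUrlOverride?: string): GlmRoute | null {
  if (!model.startsWith("glm-")) {
    return null;
  }
  return {
    provider: "glm",
    baseUrl: baseUrlOverride ?? GLM_DEFAULT_BASE_URL,
    apiKeyEnv: "GLM_API_KEY",
  };
}
```

With no override, `resolveGlmRoute("glm-4.6")` yields the default Z.ai endpoint. Passing the full URL of the coding-plan or mainland endpoint as the second argument would redirect the same model there.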

v1.1.0 - /v1/messages and /v1/responses

31 May 17:06
@crossps crossps

Adds two more inbound API formats so almost any client works as-is.

New

  • POST /v1/messages (Anthropic Messages in) — passthrough to Anthropic with prompt-cache breakpoints + keepalive; cross-converts to OpenAI for gpt-* models, including streaming. Lets Claude Code / the Anthropic SDK finally get the cache keepalive.
  • POST /v1/responses (OpenAI Responses in) — passthrough to OpenAI. Lets Codex / the Agents SDK route through the bridge.

Notes

  • Each request replies in the same format it was sent in.
  • Responses->Anthropic translation is not implemented yet; such requests return a clear 400. See issue #1.
  • 17 tests, all passing.
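
The format rules in this release can be summarised as a small decision function. The sketch below is an illustration under assumptions, not the bridge's implementation: `planRequest` and its types are hypothetical, the `/v1/chat/completions` path is assumed from the "OpenAI-compatible endpoint" wording, and matching Anthropic models by a `claude-` prefix is also an assumption (the notes only mention `gpt-*`).

```typescript
// Hypothetical sketch of inbound-format dispatch; all names are illustrative.
type Inbound = "chat" | "messages" | "responses";
type Upstream = "openai" | "anthropic";

type Plan =
  | { ok: true; replyFormat: Inbound; upstream: Upstream; convert: boolean }
  | { ok: false; status: number; error: string };

function inboundFor(path: string): Inbound | null {
  switch (path) {
    case "/v1/chat/completions": return "chat"; // assumed path
    case "/v1/messages": return "messages";
    case "/v1/responses": return "responses";
    default: return null;
  }
}

function upstreamFor(model: string): Upstream | null {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic"; // assumed prefix
  return null;
}

function planRequest(path: string, model: string): Plan {
  const inbound = inboundFor(path);
  const upstream = upstreamFor(model);
  if (inbound === null || upstream === null) {
    return { ok: false, status: 400, error: "unsupported path or model" };
  }
  // Responses -> Anthropic translation is deferred in v1.1.0.
  if (inbound === "responses" && upstream === "anthropic") {
    return { ok: false, status: 400, error: "Responses->Anthropic is not supported yet" };
  }
  // Anthropic is the native upstream for /v1/messages; OpenAI for the other two.
  const nativeUpstream: Upstream = inbound === "messages" ? "anthropic" : "openai";
  // The reply always uses the inbound format, whichever upstream served it.
  return { ok: true, replyFormat: inbound, upstream, convert: upstream !== nativeUpstream };
}
```

For example, `planRequest("/v1/messages", "gpt-4o")` yields a plan that converts to OpenAI upstream but still replies in Messages format, while `planRequest("/v1/responses", "claude-3-5-sonnet")` yields the 400 case.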

v1.0.0 - Initial release

31 May 16:43
@crossps crossps

First public release of LLM Bridge & Cache — a zero-dependency, bring-your-own-key local API bridge.

Features

  • Multi-provider routing — one OpenAI-compatible endpoint in front of OpenAI & Anthropic; route by model name or explicit provider/model.
  • Prompt-cache keepalive — keeps Anthropic's ~5-min prompt cache warm with minimal max_tokens=1 pings; multi-chat, auto-evicting.
  • Prompt injection — system prepend/append/replace + depth injection.
  • Streaming, tools, and vision translated between OpenAI and Anthropic formats.
  • Multi-key round-robin, CORS, and a /status endpoint that never exposes keys.
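
To make the keepalive idea concrete, here is a minimal sketch of how a multi-chat, auto-evicting ping schedule could be tracked. It is not the project's code: `KeepaliveRegistry`, the 4-minute ping interval, and the 30-minute idle eviction window are all assumptions picked for illustration. The only facts taken from the notes are the ~5-minute cache lifetime and the `max_tokens=1` ping.

```typescript
// Hypothetical sketch; class name, intervals, and request shape are assumptions.
interface ChatEntry {
  lastActivityMs: number; // last real request from the client
  lastPingMs: number;     // last time the cache was refreshed (request or ping)
}

class KeepaliveRegistry {
  private chats = new Map<string, ChatEntry>();
  private pingIntervalMs: number;
  private idleEvictMs: number;

  // Defaults: ping a bit before the ~5-min cache expires; forget chats idle for 30 min.
  constructor(pingIntervalMs: number = 4 * 60 * 1000, idleEvictMs: number = 30 * 60 * 1000) {
    this.pingIntervalMs = pingIntervalMs;
    this.idleEvictMs = idleEvictMs;
  }

  // Record a real request; it also refreshes the provider-side cache.
  touch(chatId: string, nowMs: number): void {
    this.chats.set(chatId, { lastActivityMs: nowMs, lastPingMs: nowMs });
  }

  // Evict idle chats and return the ids that need a ping now.
  tick(nowMs: number): string[] {
    const due: string[] = [];
    this.chats.forEach((entry, chatId) => {
      if (nowMs - entry.lastActivityMs >= this.idleEvictMs) {
        this.chats.delete(chatId);
      } else if (nowMs - entry.lastPingMs >= this.pingIntervalMs) {
        entry.lastPingMs = nowMs;
        due.push(chatId);
      }
    });
    return due;
  }

  size(): number {
    return this.chats.size;
  }
}

// A ping replays the cached request with the smallest possible completion.
function buildPing<T extends object>(lastRequest: T): T & { max_tokens: number; stream: boolean } {
  return { ...lastRequest, max_tokens: 1, stream: false };
}
```

A caller would invoke `tick(Date.now())` on a timer and send `buildPing(...)` for each returned chat id. Passing time in explicitly keeps the scheduling logic easy to test.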

Requires Node >= 18.17. Run it with npx llm-bridge-cache.
