v2.1: token expiry, browser/CORS support, per-token rate limiting #2

Open

Bug-Finderr wants to merge 20 commits into main from v2.1


@Bug-Finderr Bug-Finderr commented Jun 22, 2026

Owner

Summary

Three KISS, key-safe, free-tier additions to the token-gated proxy, plus prep/cleanup. The real provider key never rides any new path.

Features

- Token expiry (60445bd) - optional expiresAt (UTC ISO) on a token, enforced in getValidatedByHash, which rejects past or malformed values and fails closed on NaN. KV expirationTtl is deliberately not used: it has a 60s floor, deletes the record, and orphans the :lu key. The admin dashboard gets an "Expires" field and column; expired tokens render as "expired" and are dimmed.
- CORS preflight + browser support (7d05d45) - handleProxy answers OPTIONS before auth (previously every browser preflight got a 401), reflects Origin on every response, and exposes the Gemini resumable-upload headers. The upload URL still passes through verbatim (bytes go client->Google; the key is never on that leg).
- Per-token rate limiting (4c928a3) - Workers Rate Limiting binding keyed on the token hash; 429 + Retry-After: 60 on deny, fail-open so a missing or erroring binding never bricks the proxy. One shared ceiling of 100 req/60s (tunable). It is a loose, per-colo ceiling: abuse protection, not a strict quota.

Prep / cleanup

- Renamed the token concept "doppelganger" -> "proxy token" everywhere and the prefix dgk_ -> ptk_ (cosmetic; existing tokens validate by hash and are unaffected) (b2dad21).
- Archived the v1 per-provider proxies and schedule.sh to _legacy/ (b2dad21, bfe4d3a).
- Dashboard now auto-refreshes every 10s to mask the ~60s KV list() lag (bfe4d3a).

Deferred (not in this PR)

Spend/usage caps (needs a metering DO + SSE usage parsing), key pools (YAGNI), concurrency / longer windows.

Testing

- 79 tests green (72 tier-1 in workerd + 7 tier-2 real-SDK). tsc clean.
- Live-verified on the deployed Free-plan worker: expiry enforcement (expired -> 401, no upstream call), CORS preflight (204) with reflected Origin, and rate limiting (the binding deploys and limit() enforces on the Free plan, which was the only research claim still unverified).
- Gemini remains untested against the actual API (no key yet); this is documented in the README.

Bug-Finderr added 5 commits

June 22, 2026 23:05

@Bug-Finderr


 chore: rename token concept to "proxy token", archive v1 proxies, prep v2.1

b2dad21

- doppelganger -> "proxy token" across docs/README/tests/comments; token prefix
 dgk_ -> ptk_ (cosmetic only; tokens validate by hash, existing ones unaffected)
- move the three v1 per-provider passthrough workers + wrangler configs to
 _legacy/v1/ (git renames) with a README; reference-only, not deployed
- document Gemini as built-but-unproven (no key yet; mock-tested only)
- align lefthook pre-commit biome flag with the lint script (--unsafe)

@Bug-Finderr


 feat: per-token expiry dates (check-at-validate)

60445bd

- optional expiresAt (UTC ISO) on TokenMetadata + CreateInput
- getValidatedByHash rejects past/malformed expiry; fail-closed on NaN
- admin: "Expires (optional)" datetime field + "Expires" column; expired
 tokens show "expired" and dim the row
- not KV expirationTtl (60s floor, deletes record, orphans the :lu key)
- tests: absent / future / past / malformed expiry
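
The check-at-validate rule above can be sketched roughly as follows. This is an illustration, not the PR's code: `isExpired` and the trimmed `TokenMetadata` shape are hypothetical names, and treating "exactly now" as expired is an assumption. Only the optional `expiresAt` field and the fail-closed behaviour on NaN come from the commit message.

```typescript
// Hypothetical, trimmed-down shape; the real TokenMetadata carries more fields.
interface TokenMetadata {
  expiresAt?: string; // optional UTC ISO timestamp
}

// Returns true when the token must be rejected because of its expiry.
// Fail-closed: a malformed expiresAt parses to NaN and is treated as expired.
function isExpired(meta: TokenMetadata, nowMs: number = Date.now()): boolean {
  if (meta.expiresAt === undefined) return false; // no expiry configured
  const expiresMs = Date.parse(meta.expiresAt);
  if (Number.isNaN(expiresMs)) return true; // malformed -> reject
  return expiresMs <= nowMs; // assumption: the boundary instant counts as expired
}
```

Checking at validation time keeps the KV record (and its companion :lu key) in place, so an expired token can still be listed and shown as expired in the dashboard.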

@Bug-Finderr


 feat: CORS preflight + reflect-Origin so browser SDKs work

7d05d45

- handleProxy answers the OPTIONS preflight (204) before auth checks, reflecting
 Origin + the requested headers; previously every browser preflight 401'd
- reflect Origin on every response and expose the Gemini resumable-upload headers
 (x-goog-upload-url etc.) so browser clients can read them
- Gemini upload URL still passes through verbatim (bytes go client->Google direct;
 the real key never rides that leg). Browser callers set the SDK's own opt-in.
- tests: preflight, reflected-Origin + expose-headers, no-Origin no-op
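
A rough sketch of the preflight-before-auth flow described above, using only standard Fetch API types. `applyCors` and `handlePreflight` are hypothetical names, and the method allow-list is an assumption; the commit only states that the preflight returns 204, that Origin and the requested headers are reflected, and that x-goog-upload-url (among others not listed here) is exposed.

```typescript
// Only x-goog-upload-url is named in the commit; the rest of the real list
// is elided there ("etc.") and is deliberately not guessed here.
const EXPOSED_HEADERS = ["x-goog-upload-url"];

// Adds reflect-Origin CORS headers; a request without Origin is a no-op.
function applyCors(request: Request, headers: Headers): Headers {
  const origin = request.headers.get("Origin");
  if (origin === null) return headers; // non-browser caller
  headers.set("Access-Control-Allow-Origin", origin);
  headers.append("Vary", "Origin");
  headers.set("Access-Control-Expose-Headers", EXPOSED_HEADERS.join(", "));
  return headers;
}

// Answers an OPTIONS preflight with 204 before any auth check runs.
// Returns null for every other method so the caller continues to auth.
function handlePreflight(request: Request): Response | null {
  if (request.method !== "OPTIONS") return null;
  const headers = applyCors(request, new Headers());
  const requested = request.headers.get("Access-Control-Request-Headers");
  if (requested !== null) headers.set("Access-Control-Allow-Headers", requested);
  // Assumed allow-list; the real one lives in the worker source.
  headers.set("Access-Control-Allow-Methods", "GET, POST, PUT, PATCH, DELETE, OPTIONS");
  return new Response(null, { status: 204, headers });
}
```

A preflight carries no credentials, which is why it has to be answered ahead of token validation; the actual request that follows is still authenticated as usual.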

@Bug-Finderr


 feat: per-token RPM rate limiting (Workers Rate Limiting binding)

4c928a3

- after token validation, limit() keyed on the SHA-256 hash; 429 + Retry-After: 60
 on deny. Fail-open on a missing/erroring binding so it never bricks the proxy.
- wrangler [[ratelimits]] RATE_LIMITER at 100 req / 60s (one shared ceiling, KISS;
 tune freely). Verified live on the Free plan: the binding deploys and limit() enforces.
- per-colo + eventually-consistent: a loose ceiling that stops sustained abuse, not a
 strict gate (documented).
- tests: deny -> 429, allow -> forward, throw -> fail-open
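
The deny / allow / fail-open behaviour above might look roughly like this. `checkRateLimit` is a hypothetical helper and `RateLimitBinding` is a local stand-in type declared only so the sketch is self-contained; the `limit({ key })` call returning `{ success }` follows the Workers Rate Limiting binding, while the response body text is an assumption.

```typescript
// Minimal local type for the Workers Rate Limiting binding's limit() call.
interface RateLimitBinding {
  limit(options: { key: string }): Promise<{ success: boolean }>;
}

// Returns a 429 response when the token is over its ceiling, otherwise null.
// Fail-open: a missing binding or a throwing limit() lets the request through.
async function checkRateLimit(
  limiter: RateLimitBinding | undefined,
  tokenHash: string,
): Promise<Response | null> {
  if (limiter === undefined) return null; // binding not configured
  try {
    const { success } = await limiter.limit({ key: tokenHash });
    if (success) return null; // under the ceiling: forward upstream
  } catch {
    return null; // binding error: never brick the proxy
  }
  return new Response("rate limit exceeded", {
    status: 429,
    headers: { "Retry-After": "60" },
  });
}
```

In a worker this would be called as something like `checkRateLimit(env.RATE_LIMITER, hash)` right after token validation, returning the 429 early when it is non-null.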

@Bug-Finderr


 chore: address handoff items - archive schedule.sh, dashboard auto-refresh, docs

bfe4d3a

- move schedule.sh -> _legacy/ (archived helper; paths resolve from repo root,
 provider flags point at the archived _legacy/v1/ workers)
- dashboard: poll the token list every 10s so new tokens / lastUsed surface despite
 KV list() eventual consistency (~60s)
- README: document per-token controls (expiry, rate limit) + browser/CORS support,
 mark Gemini "untested with the actual API", point disable/enable at _legacy/

@Bug-Finderr Bug-Finderr self-assigned this

Jun 22, 2026

Bug-Finderr added 15 commits

June 22, 2026 23:45

@Bug-Finderr


 test: rename leftover DOPPEL fixtures to PROXY-TOKEN

bdd0f02

@Bug-Finderr


 docs: add docs/architecture.md, capture v2.1 learnings, drop legacy from README

1074a1c

- docs/architecture.md: the full current design (topology, request flow, routing,
 auth swap, token model, rate limiting, CORS, OpenAI egress DO, admin, testing,
 security) - replaces the superpowers design spec
- learnings: rate-limit binding (free on Free plan, loose per-colo) and token
 expiry (check-at-validate, fail-closed)
- README: link to docs/architecture.md; drop the legacy schedule.sh section

@Bug-Finderr


 docs: replace broken ASCII diagrams in architecture.md with clean markdown

5df93ab

@Bug-Finderr


 docs: consumer-first README + cleaner architecture.md

862c404

- README: lead with "Use it" — the libraries it works with (official OpenAI/Anthropic/
 GenAI SDKs + any standard-auth client) and how a client points at the worker; trim
 "How it works" to a brief mechanism + link to architecture.md
- architecture.md: replace ASCII pseudocode/box diagrams with numbered lists + a simple
 dispatch snippet; drop the repo-layout section (it just rots)

@Bug-Finderr


 docs: drop §17 repo layout, keep the 5df93ab diagram formatting

e084ce2

Restore the request-flow/topology formatting from 5df93ab (a prior full-file rewrite
had overwritten it) and remove the repo-layout section, which only rots.

@Bug-Finderr


 test: add raw-fetch + LiteLLM compat coverage; name sdk-compat files after the client

39f4076

- fetch.ts: raw HTTP - covers the Gemini ?key= auth slot (no SDK exercises it),
 verbatim request-body forwarding, and an end-to-end CORS preflight
- litellm.py: separate Python runner (local .venv) driving LiteLLM through the worker;
 run via `nub run test:py`, also chained into `nub run test`
- rename openai/anthropic/gemini/fetch .test.ts -> .ts (file = the client it drives);
 compat config globs test/sdk-compat/*.ts and excludes the setup.ts harness
- README documents each test as a per-client usage example + the venv setup

@Bug-Finderr


 docs: tighten architecture.md; add CORS-preflight/upload-passthrough learning

3f9abf8

- architecture.md: note auth-slot precedence over ?key=, the CORS method allow-list +
 Vary: Origin, Path=/admin on the admin cookie; drop the unsourced "~60%" figure;
 trim the §15 recap; fix the ratelimits TOML spacing
- learnings: new cors-preflight-and-upload-passthrough.md (preflight before auth; the
 Gemini upload URL is passed through, not rewritten) + note the 8-way egress DO pool

@Bug-Finderr


 refactor(test): name sdk-compat files after their packages; move requirements to test/

f7a3af6

- gemini.ts -> google-genai.ts (@google/genai), anthropic.ts -> anthropic-ai-sdk.ts
 (@anthropic-ai/sdk); openai.ts already matches its package. Avoids collision with the
 future Vercel AI SDK (@ai-sdk/*) per-provider tests (see HANDOFF).
- move requirements.txt to test/ (beside the run-py.mjs runner); update the path in
 run-py.mjs, litellm.py, requirements.txt, and the README

@Bug-Finderr


 docs: prove SDK compat is the auth slot, not the SDK/language

b3ea890

An 8-agent source-level survey (official SDKs in Python/Node/Go/Java/Ruby/.NET,
Vercel AI SDK incl. @ai-sdk/google, LangChain JS+Py, LiteLLM, LlamaIndex,
instructor, Aider/Cline/Continue/Open WebUI) confirms that every client collapses
onto one of the 4 auth slots already tested; none hits a new slot or path. The
decisive case is @ai-sdk/google, which uses the x-goog-api-key header at source
(not ?key=, not Bearer) and so maps to the existing gemini slot.
Per-SDK / per-language compat tests would therefore be redundant given the proxy's
routing logic, so none were added. Instead: a new learnings doc with the proof
matrix and caveats (Anthropic OAuth Bearer mode, legacy google-generativeai gRPC
default, OpenAI /v1/responses verbatim forward), and README + architecture.md
tightened to the auth-slot-not-SDK claim.

@Bug-Finderr


 docs: fix review findings in compat learning

71c1fdf

- Drop fabricated `hitsNewProxyPath` symbol reference (it was a research
 schema field, not a codebase symbol; a reader would grep and find nothing).
- Resolve the "four slots" conflation: the table lists the four provider
 routes SDKs use (3 distinct header slots + the /v1beta/openai/ path split),
 and the `?key=` query slot (no SDK uses it, only raw HTTP) is now called out
 explicitly as the fourth slot the proxy reads.
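
To make the "slot" idea concrete, here is a hedged sketch of reading a proxy token from the places a client can put it. `findProxyToken` and the `AuthSlot` names are hypothetical. The x-goog-api-key header, Bearer, and the ?key= query slot are named in these commits; treating x-api-key as the third header slot and the order among headers are assumptions. Header slots are checked before ?key=, matching the precedence noted for architecture.md.

```typescript
type AuthSlot = "bearer" | "x-api-key" | "x-goog-api-key" | "query-key";

interface FoundToken {
  slot: AuthSlot;
  token: string;
}

// Looks for a client-supplied token, header slots first, ?key= last.
function findProxyToken(request: Request): FoundToken | null {
  const bearer = request.headers.get("Authorization");
  if (bearer !== null && bearer.startsWith("Bearer ")) {
    return { slot: "bearer", token: bearer.slice("Bearer ".length) };
  }
  // Assumed header order; only precedence over ?key= is documented.
  for (const name of ["x-api-key", "x-goog-api-key"] as const) {
    const value = request.headers.get(name);
    if (value !== null) return { slot: name, token: value };
  }
  // Query slot: no SDK uses it, only raw HTTP callers.
  const key = new URL(request.url).searchParams.get("key");
  return key !== null ? { slot: "query-key", token: key } : null;
}
```

Because compatibility depends only on which slot a client fills, any SDK that sends one of these headers behaves the same from the proxy's point of view, whatever language it is written in.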

@Bug-Finderr


 docs: make the per-provider anchor test explicit in the compat learning

c86f499

State plainly that each provider has one real-SDK anchor test (openai.ts/
litellm.py, anthropic-ai-sdk.ts, google-genai.ts/fetch.ts) and that the
by-construction claim is only valid because it extends those verified anchors -
without an anchor it proves nothing.

@Bug-Finderr


 test: add per-library compat tests; fix test:py harness (no orphaned workerd)

8369fd1

Per the clarified rule - dedup across LANGUAGES of a package, not across
packages - each distinct client library now gets one end-to-end test:
- Node: Vercel AI SDK (@ai-sdk/openai|anthropic|google), LangChain
 (@langchain/openai|anthropic|google-genai), Genkit
- Python: LlamaIndex (openai+anthropic+gemini), instructor, Pydantic AI
Each drives the real library at the worker and asserts the mock saw the
real key swapped into the right slot with the proxy token absent.
Mastra is EXCLUDED: nub flagged @mastra/core 1.x as malicious (advisory
MAL-2026-6011, embedded malicious code); not bypassed. Other-language
packages of a tested SDK, end-user apps (Aider/Cline/Continue/OpenWebUI),
and JVM/.NET frameworks stay documented as compatible-by-construction.
Fix test:py hang + workerd leak: litellm.py spawned `npx wrangler dev`
with shell=True, so terminate() orphaned workerd on Windows. Rebuilt
test/run-py.mjs to own ONE worker (unstable_dev, clean teardown) + ONE
mock with /__captured + /__reset endpoints; Python files are thin clients
reading PROXY_* env. Async spawn (spawnSync froze the mock event loop) +
a hard per-file timeout so nothing hangs.
Docs: README, architecture.md §13, and the compat learning updated to the
test-each-library / dedup-across-languages framing.

@Bug-Finderr


 test: address review - pin python deps, harden py harness

c2c334e

- Pin test/requirements.txt with ~= (lock minor): these libs shift default
 endpoints across minors, which the compat tests encode. Drop unused
 llama-index-llms-openai-like.
- run-py.mjs: match setup.ts providerFromPath (/v1beta/ fallback) so the two
 mocks do not drift; strip real provider keys (OPENAI/ANTHROPIC/GEMINI/GOOGLE)
 from the child env so the seeded proxy token is the only key in play.
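
Stripping provider keys from the child environment could be done along these lines. `scrubProviderKeys` is a hypothetical helper, and matching on the exact `<PROVIDER>_API_KEY` name is an assumption; the commit only lists the four provider prefixes.

```typescript
// Provider prefixes listed in the commit message.
const PROVIDERS = ["OPENAI", "ANTHROPIC", "GEMINI", "GOOGLE"];

// Returns a copy of env without real provider keys, so the seeded proxy
// token is the only credential a child process can pick up.
function scrubProviderKeys(
  env: Record<string, string | undefined>,
): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const [name, value] of Object.entries(env)) {
    if (value === undefined) continue; // unset entries are dropped
    // Assumed naming convention: <PROVIDER>_API_KEY.
    const isProviderKey = PROVIDERS.some((p) => name === `${p}_API_KEY`);
    if (!isProviderKey) clean[name] = value;
  }
  return clean;
}
```

The result can be passed as the `env` option when spawning the Python test files, so a key exported in the developer's shell cannot leak into a compat run and mask a broken key swap.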

@Bug-Finderr


 docs: use uv for the Python test venv setup (replaces pip)

48bc5d0

uv venv + uv pip install -r is the fast drop-in for python -m venv + pip:
uv pip install auto-targets the .venv, no activation needed. The runner's
.venv/Scripts/python path is unchanged (uv creates a standard venv).
Verified: recreated .venv with uv (Python 3.14.5), test:py all green.
Updated README, requirements.txt header, and the run-py.mjs skip hint.

@Bug-Finderr


 chore(test): bump python compat deps to latest

01b8c6f

litellm 1.83.7->1.89.3, openai 2.30.0->2.43.0, pydantic 2.12.5->2.13.4
(others already latest). Recreated .venv with uv, full suite green:
72 unit + 16 compat + 4 python.
