Releases: diegosouzapw/OmniRoute

v3.8.24

14 Jun 00:42
@diegosouzapw diegosouzapw
b7ac403
This commit was created on GitHub.com and signed with GitHub’s verified signature.
GPG key ID: B5690EEEBB952194
Verified

[3.8.24] — 2026-06-13

✨ New Features

  • feat(plugins): custom plugin marketplace support — the plugin registry now fetches from a custom URL set in system settings (pluginMarketplaceUrl), falling back to the local seed registry when none is configured. Adds a GET /api/plugins/marketplace endpoint and a revamped Marketplace UI. (#3656 — thanks @oyi77)
  • feat(api-keys): strict-mode controls for the Claude Code default routing path. Claude Code default is now an explicit cc/* model permission, so an API key can allow the default path while blocking specific model families (e.g. Fable) in strict mode. Previously the default path received dynamic/unprefixed models (sonnet, opus, claude-opus-4-8[1m], ...) that no single permission represented, so it broke under strict permissions. (#3776 — thanks @Witroch4)
  • feat(flags): expose the emergency budget fallback in the Feature Flags page. OMNIROUTE_EMERGENCY_FALLBACK is now a runtime boolean (enabled by default, applied without restart) resolved through the feature-flag stack, so a DB override can toggle it while still honoring the raw env fallback. Follow-up to #3741 by @zoispag. (#3752 — thanks @rdself)
  • feat(reasoning): preserve xhigh reasoning effort by default. xhigh now passes through unless a model explicitly sets supportsXHighEffort: false, with the existing max normalization kept separate. (#3756 — thanks @rdself)
  • feat(codex): inject OmniRoute memory into Codex Responses WebSocket requests — retrieved memory is injected into the Responses WebSocket prepare request via the instructions field, with the retrieval query derived from the latest user input (skipping tool/reasoning payloads) and duplicate-safe injection. (#3749 — thanks @kkkayye)
  • feat(dashboard): free provider rankings page — a new dashboard page (with sidebar entry) that ranks free providers (no-auth / OAuth / API-key) by their models' Arena ELO / intelligence scores, joining the provider registry with the model_intelligence table via fuzzy model-name matching. Pure computation over existing data — no external calls. (#3799 — thanks @pizzav-xyz)
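
The feat(flags) entry above describes a flag that a DB override can toggle while the raw env value remains a fallback. A minimal sketch of that kind of layered resolution follows. The function names and the exact precedence (DB override, then env, then default) are assumptions for illustration, not OmniRoute's actual feature-flag code; the "false"/"0" values come from the 3.8.23 notes.

```typescript
// Hypothetical helper: parse a raw env string into an optional boolean.
function parseEnvBoolean(raw: string | undefined): boolean | undefined {
  if (raw === undefined) return undefined;
  const value = raw.trim().toLowerCase();
  if (value === "") return undefined;
  // The 3.8.23 notes say "false" or "0" disables the fallback.
  if (value === "false" || value === "0") return false;
  return true; // assumption: any other non-empty value leaves it enabled
}

// Hypothetical resolver: DB override first, then env, then the default (enabled).
function resolveEmergencyFallback(
  dbOverride: boolean | undefined,
  envValue: string | undefined
): boolean {
  if (dbOverride !== undefined) return dbOverride;
  const fromEnv = parseEnvBoolean(envValue);
  return fromEnv ?? true;
}

const flagExamples = {
  nothingSet: resolveEmergencyFallback(undefined, undefined), // true
  envDisabled: resolveEmergencyFallback(undefined, "0"), // false
  dbReenabled: resolveEmergencyFallback(true, "false"), // true
};
```

Because the resolver is a pure function of its two inputs, re-evaluating it on each request is one way a change could take effect without a restart.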

🔒 Security

  • security(proxy): IPv6-only egress enforcement + closing IP-leak paths (L1/L2/L3) — de-brackets IPv6 literals at the SOCKS host and the proxy-health tcpCheck (so socks5://[2001:db8::1] and any v6-literal proxy connect instead of dying with ENOTFOUND), adds a per-proxy family policy (auto/ipv4/ipv6), and enforces it end-to-end across SOCKS5/HTTP/HTTPS × global/provider/key × literal/hostname. 16 commits, TDD+BDD, 73 tests. (#3777)
  • security(marketplace): harden the custom-URL SSRF guard against three bypasses found by automated security review — IPv6/AAAA records (only dns.resolve4 was checked, so a private AAAA record or an IPv6 literal slipped through), redirect-following (a public URL could 30x to an internal one), and DNS-rebinding TOCTOU. The guard now resolves A+AAAA via the canonical isPrivateHost, routes the fetch through safeOutboundFetch (public-only, blocks redirects to private hosts), and re-validates on fetch. Reachable only with management auth + a custom pluginMarketplaceUrl. Follow-up to #3656. (#3774)
  • security: resolve all open CodeQL + Dependabot alerts — CodeQL js/insufficient-password-hash (the semantic-cache apiKeyId is now a plaintext key prefix, ${apiKeyId}.${digest}, instead of being folded into the SHA-256 digest, clearing the false positive while preserving per-key cache isolation) and a URL-substring check tightened to an exact host match; Dependabot esbuild < 0.28.1 pinned via override in both workspaces. (#3778) The remaining js/incomplete-url-substring-sanitization instances in the api-key proxy-context test were also cleared by asserting on the parsed URL host/port instead of a substring includes. (thanks @diegosouzapw)
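
The marketplace SSRF entry above turns on checking both address families. A guard that only inspects A records lets a private AAAA record or an IPv6 literal through. The sketch below classifies an already-resolved address list and fails closed on anything it cannot parse. It is an illustration only: the helper names are hypothetical, the range list is deliberately small, and it is not the project's isPrivateHost.

```typescript
// Hypothetical classifier for IPv4 literals; unparseable input counts as private.
function isPrivateIPv4(ip: string): boolean {
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(ip)) return true;
  const octets = ip.split(".").map(Number);
  if (octets.some((o) => o > 255)) return true;
  const a = octets[0] ?? 0;
  const b = octets[1] ?? 0;
  return (
    a === 0 || a === 10 || a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Hypothetical classifier for IPv6 literals (brackets are stripped first).
function isPrivateIPv6(ip: string): boolean {
  const addr = ip.toLowerCase().replace(/^\[|\]$/g, "");
  const first = parseInt(addr.split(":")[0] || "0", 16);
  if (Number.isNaN(first)) return true;
  return (
    first === 0 || // ::, ::1 and IPv4-mapped forms: treated as non-public
    (first & 0xfe00) === 0xfc00 || // fc00::/7 unique local
    (first & 0xffc0) === 0xfe80 // fe80::/10 link-local
  );
}

// Every resolved address (A and AAAA) must be public; an empty list is rejected.
function allAddressesPublic(addresses: string[]): boolean {
  if (addresses.length === 0) return false;
  return addresses.every((a) =>
    a.includes(":") ? !isPrivateIPv6(a) : !isPrivateIPv4(a)
  );
}
```

A check like this covers only the resolution step. The redirect and DNS-rebinding bypasses named in the entry need the fetch itself to re-validate, which this sketch does not attempt.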

🐛 Fixed

  • fix(dashboard): surface the Plugins page (plugin manager + marketplace) in the sidebar — the plugins page (/dashboard/plugins), which hosts the custom plugin marketplace shipped in #3656, had no menu entry and was reachable only by typing the URL. It now appears under Agentic Features. (thanks @diegosouzapw)
  • fix(proxy): add the IP-family selector (auto / IPv4-only / IPv6-only) to the proxy form — the per-proxy family egress policy from #3777 was backend-only (the dashboard had no control, so every proxy stayed on auto). The proxy registry form now exposes the selector and the create/update schema accepts it, so IPv6-only egress can be enabled from the UI. (thanks @diegosouzapw)
  • fix(combo): deep audit of the combo + quota-shared routing system — repairs 5 dead/broken rules (streaming-USD cost recording, quota-pool usage provider resolution, provider-diversity wiring, maxComboDepth threading, and scoring clamp/NaN-safety incl. connectionDensity) and revives the dead tierAffinity/specificityMatch scoring factors — root cause was a require() that throws under ESM, so both factors silently collapsed to 0.5; now a static import. Validates every auto-router strategy (cost / latency / sla-aware / lkgp / selectWithStrategy + aliases) and the predictive-TTFT decision, adds E2E coverage (3-hop priority failover, per-target timeout failover, real strategy:auto dispatch), and introduces opt-in complexity-aware routing (2026) layered over the existing specificity detector. Per-target credential+proxy isolation verified clean (AsyncLocalStorage). 4 TDD waves, 10 new/updated test files. (#3779 — thanks @diegosouzapw)
  • fix(anthropic): normalize sampling params under extended thinking — Claude models with extended thinking (e.g. Opus 4.8 via the Claude Code provider) returned HTTP 400 when a request carried non-default temperature/top_p (temperature may only be set to 1 ..., top_p must be ≥ 0.95 or unset ...). Tools like VS Code Copilot's "Ollama" BYOK send temperature: 0.7 + top_p: 0.9, so every thinking-enabled Claude request failed; the proxy now drops/normalizes these params at the chokepoint so the request succeeds. (#3780 — thanks @zhiru)
  • fix(sse): pass Claude passthrough thinking blocks through unchanged — the Anthropic-native Claude OAuth passthrough rewrote every assistant thinking block to redacted_thinking, which the Messages API rejects (submitted thinking blocks are validated against the original response), so every multi-turn request with extended thinking failed with 400 ... thinking blocks ... cannot be modified (very visible on long Claude Code tool-loops). The blocks are now passed through verbatim; the signature is validated server-side and stays valid on replay (including across an OAuth token switch), so the redaction was unnecessary. (#3775 — thanks @havockdev)
  • fix(mcp): resolve the bundled MCP server entry from dist/ instead of the legacy app/ path — omniroute --mcp crashed on npm installs with ERR_MODULE_NOT_FOUND: Cannot find package @/lib because bin/mcp-server.mjs looked for the compiled entry under app/ (a VPS-deploy path that never exists in the npm package) and fell back to the un-bundled .ts. (#3765 — thanks @megamen32)
  • fix(sse): preserve streamed tool-call arguments end-to-end — incremental tool-call argument deltas could be truncated/duplicated through SSE parsing, transformation and response translation, corrupting tool calls in CLI tool-use output. Dedup now only collapses unambiguous snapshots. (#3762, closes #3701 — thanks @Mffff4)
  • fix(dashboard): repair the Logs page light-mode controls — the "Clean history" button keeps readable contrast (while preserving the red destructive affordance), request-row hover uses a cool blue tint so it no longer reads like a failed-request row, and the custom auto-refresh interval persists in localStorage (clamped to 1–300s). Also refreshes the Feature Flags light-mode treatment. (#3760 — thanks @rdself)
  • fix(dashboard): make the Request Logs "Clean history" action perform a full request-history purge — clears call_logs, legacy request_detail_logs, and local JSON artifacts under DATA_DIR/call_logs (including orphaned artifact files) via a dedicated maintenance API route, instead of a retention-only cleanup. (#3751 — thanks @rdself)
  • fix(cli): detect CLI tools installed outside the GUI PATH on macOS. macOS GUI/Electron apps don't inherit the user's login-shell PATH, so Homebrew (/opt/homebrew/bin), nvm and volta-installed CLIs (Cline, Codex, OpenCode, Continue, Hermes, ...) were reported "not installed" and the Cline runtime couldn't be spawned. CLI detection (omniroute doctor) and the provider-runtime lookup now enrich the lookup PATH with the login shell's PATH ($SHELL -ilc, darwin-only, cached, fail-safe). (#3321 — thanks @mikm...
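
The fix(anthropic) entry above says the proxy drops or normalizes sampling parameters when extended thinking is enabled. One way to picture that step is sketched below. The request shape is reduced to the fields involved, the function name is hypothetical, and the choice to drop (rather than clamp) out-of-range values is an assumption. The thresholds are the ones quoted in the upstream error text.

```typescript
// Reduced request shape; only the fields relevant to this sketch.
interface ChatRequest {
  model: string;
  temperature?: number;
  top_p?: number;
  thinking?: { type: "enabled" | "disabled"; budget_tokens?: number };
}

// Hypothetical normalizer: keep only values the thinking mode accepts.
function normalizeSamplingForThinking(req: ChatRequest): ChatRequest {
  if (req.thinking?.type !== "enabled") return req; // nothing to do
  const { temperature, top_p, ...rest } = req;
  const out: ChatRequest = { ...rest };
  // "temperature may only be set to 1": anything else is dropped.
  if (temperature === 1) out.temperature = 1;
  // "top_p must be >= 0.95 or unset": smaller values are dropped.
  if (top_p !== undefined && top_p >= 0.95) out.top_p = top_p;
  return out;
}

// The combination the entry mentions: temperature 0.7 with top_p 0.9.
const normalized = normalizeSamplingForThinking({
  model: "example-model", // placeholder id
  temperature: 0.7,
  top_p: 0.9,
  thinking: { type: "enabled", budget_tokens: 1024 },
});
```

Here `normalized` carries neither `temperature` nor `top_p`, so the upstream defaults apply.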

v3.8.23

13 Jun 02:50
@diegosouzapw diegosouzapw
de60b4b

[3.8.23] — 2026-06-13

✨ New Features

  • Emergency budget fallback: opt-out env switch OMNIROUTE_EMERGENCY_FALLBACK (#3741 — thanks @zoispag): adds an OMNIROUTE_EMERGENCY_FALLBACK environment variable that disables the budget-exhaustion emergency reroute to nvidia/openai/gpt-oss-120b entirely when set to false or 0. Default behavior (enabled) is unchanged.

  • Auto-Combo: live model intelligence scoring via Arena ELO + models.dev (#3660 — thanks @pizzav-xyz): replaces the static fitness lookup with a 5-layer resolution chain (user override → Arena ELO → models.dev tiers → hardcoded map → neutral fallback). A sync pipeline auto-fetches Arena AI leaderboard ELO scores and derives intelligence tiers from models.dev capabilities; combo picks now update as leaderboard rankings change without any manual configuration.

  • Vertex AI: dynamic model discovery (#3712 — thanks @artickc): the vertex provider now queries the Generative Language models API at runtime to surface the full account catalog — including image-generation models (Imagen, gemini-*-image), embeddings, and partner models — instead of returning only the small hardcoded registry list.

  • Vertex AI: self-tracked USD spend on the Limits page (#3724 — thanks @artickc): since the Google Cloud Billing API is inaccessible via the proxy credential, Vertex connections now track their own cumulative USD spend locally (based on token-cost accounting) and display it on the Limits page as "$ used since account added."

  • Gemini: rate-limit metadata for known per-model RPM/RPD caps (#3686 — thanks @hartmark): injects known rate-limit headers (RPM/RPD) for Gemini models that carry per-model limits (e.g. Gemma 4's 15 RPM / generous RPD), so the cooldown engine applies them correctly instead of locking out the whole account on daily-limit hits.

  • Model Lockout: full settings UI with success-decay recovery (#3629 — thanks @Chewji9875): end-to-end wiring of the per-model lockout feature — settings UI (enable/disable, configure thresholds), backend integration, structured error classification, and a success-decay mechanism that gradually recovers a locked model's fitness as successful calls accumulate. Lockout now applies to all providers when enabled, not just per-model-quota providers.

  • Provider display modes — All / Configured / Compact (#3743 — thanks @rdself): adds a three-state display mode control to the Providers page. "All" shows every registered provider; "Configured" shows only providers with at least one connection; "Compact" shows configured providers in a condensed card layout for denser views.

  • API key cost drilldown + quota % used (#3742 — thanks @Witroch4): the API Keys page now shows a per-key cost breakdown and the percentage of quota consumed for each key.
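
The Auto-Combo entry above describes a 5-layer resolution chain (user override, Arena ELO, models.dev tiers, hardcoded map, neutral fallback). The sketch below shows the general first-match-wins pattern. The model ids, the scores, and the 0.5 neutral value are made up for illustration, and the real implementation may normalize scores differently.

```typescript
// A layer returns a score for a model id, or undefined when it has no opinion.
type ScoreLayer = (modelId: string) => number | undefined;

// Hypothetical adapter turning a lookup table into a layer.
function fromMap(scores: Map<string, number>): ScoreLayer {
  return (modelId) => scores.get(modelId);
}

// Walk the layers in priority order; the first defined score wins.
function resolveIntelligence(
  modelId: string,
  layers: ScoreLayer[],
  neutral: number
): number {
  for (const layer of layers) {
    const score = layer(modelId);
    if (score !== undefined) return score;
  }
  return neutral;
}

// Placeholder data, one map per layer.
const scoreLayers: ScoreLayer[] = [
  fromMap(new Map([["model-a", 0.95]])), // 1. user override
  fromMap(new Map([["model-a", 0.8], ["model-b", 0.7]])), // 2. Arena ELO
  fromMap(new Map([["model-c", 0.6]])), // 3. models.dev tiers
  fromMap(new Map([["model-d", 0.4]])), // 4. hardcoded map
];
const NEUTRAL_SCORE = 0.5; // 5. neutral fallback (assumed value)

const scoreA = resolveIntelligence("model-a", scoreLayers, NEUTRAL_SCORE); // 0.95
const scoreB = resolveIntelligence("model-b", scoreLayers, NEUTRAL_SCORE); // 0.7
const scoreUnknown = resolveIntelligence("model-x", scoreLayers, NEUTRAL_SCORE); // 0.5
```

Keeping each source behind the same function type means a sync job can refresh one layer's data without touching the others.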

🔧 Bug Fixes

  • @omniroute/opencode-plugin bundled in the npm tarball + omniroute setup opencode CLI command (#3726 — thanks @herjarsa): the plugin was never compiled as part of the publish pipeline, requiring manual extraction. Now ships pre-built inside the omniroute package and installed via omniroute setup opencode (copies plugin into ~/.config/opencode/plugins/omniroute/, updates opencode.json idempotently). Also fixes provider.models baseURL resolution — checks _provider.options.baseURL as a third fallback so partner/tiered providers no longer return zero models. (#3711)

  • MiMoCode 403 "Illegal access" fixed (#3728 — thanks @felipesartori): the Xiaomi free endpoint gates requests on a recognized MiMoCode system-prompt signature; OmniRoute forwarded raw requests without the marker, causing 403 on every call. The executor now injects the required anti-abuse signature.

  • "Test all models" flow: i18n crash, status icons, auto-hide (#3729 — thanks @felipesartori): three bugs in the provider-detail test-all-models flow — providerText() crash because the testAllResults template requires {ok, total} but callers passed {ok, error}; missing online/offline status icons on model rows; results panel not auto-hiding after run completes.

  • OAuth token-refresh invalidation loop fixed (#3692 — thanks @diegosouzapw): refreshClaudeOAuthToken returned null instead of the error sentinel on non-canonical 400 bodies, causing the caller to retry every 60 seconds — observed as 1,352 consecutive refresh attempts on one Claude account. Fixed alongside hardening of safeResolveProxy (proxy resolution errors now warn instead of silently falling back to DIRECT) and adding egress-IP visibility to safeLogEvents.

  • safeLogEvents async hotfix (thanks @diegosouzapw): PR #3692 introduced a lazy await import(proxyEgress) inside a sync safeLogEvents — an ES syntax error that broke every consumer loading chatHelpers via tsx and caused 14 tests to fail at module load. Made safeLogEvents async; void-ed the single chat.ts call site.

  • Kiro: quota tracking for IAM Identity Center accounts (#3722 — thanks @artickc): getKiroUsage returned "0 used" for IAM Identity Center accounts (and kiro-cli imports) because those connections frequently lack a persisted profileArn. Now falls back to a name-based profile lookup so quota displays correctly.

  • Empty Claude SSE stream now surfaces a real error (#3689 — thanks @TechNickAI): when a Claude stream completed with lifecycle events but no content block, the proxy returned a synthetic "[Proxy Error] The upstream API returned an empty response" as a successful assistant message. Now emits a proper SSE error event; the missing-finalizer synthetic path is preserved for streams that already produced content.

  • Vertex AI Express-mode API keys (#3690 — thanks @artickc): the Vertex executor rejected every non-JSON credential with "Vertex AI requires a valid Service Account JSON." Now accepts Express-mode API key strings (AIza*) alongside Service Account JSON, routing them through the correct token endpoint.

  • Anthropic: strip top_p when temperature is set (#3691 — thanks @zhiru): Anthropic API rejects requests containing both temperature and top_p; VS Code's Claude extension sends both in every request, causing 400s on all routed calls. The OpenAI→Claude translator now drops top_p when temperature is present.

  • Combo reasoning token buffer: conservative application + feature flag (#3700 — thanks @rdself): tightens the #3588 buffer (only applies when the model is explicitly thinking-capable, has a non-default known output cap, and the full buffered value fits inside that cap) and adds a reasoningTokenBufferEnabled feature flag in combo defaults so users can fully disable it from Settings.

  • Emergency budget fallback: cross-provider credential leak fixed (#3699 — thanks @diegosouzapw): the executor-level emergency hop re-sent the failing provider's API key to the emergency provider's endpoint (e.g. the OpenAI Authorization header going to integrate.api.nvidia.com). Now orchestrated exclusively by the routing layer, which resolves credentials for the emergency provider via account selection and no longer fires inside combo targets.

  • /v1/messages/count_tokens now honors the connection's proxy assignment (#3699 — thanks @diegosouzapw): token count calls went DIRECT regardless of configured proxies, leaking the host IP for proxy-isolated setups. Now wraps execution in runWithProxyContext, exactly like chat execution.

  • Gemini: context-mode fallback for signatureless tool calls (#3688 — thanks @diegosouzapw): fixes HTTP 400 on multi-turn thinking-model tool calls when thought_signature is unavailable — standard Gemini provider now falls back to context mode instead of sending the unsigned call.

  • Antigravity: preserve gemini-3.1-pro High/Low budget tiers (#3696 — thanks @diegosouzapw): upstream accepts the suffixed ids; stop collapsing to bare gemini-3.1-pro.

  • Stream combo: fail over on empty/content-filtered response (#3685 — thanks @diegosouzapw): streaming combos now route to the next target instead of surfacing a blank reply.

  • Qwen Web: migrated to v2 chat API (#3723 — thanks @diegosouzapw): the legacy /api/chat/completions endpoint was retired upstream, returning 504 HTML from Alibaba's gateway for all requests. The executor now uses the two-step v2 flow (/api/v2/chats/new → /api/v2/chat/completions?chat_id=), replays the full browser cookie jar (cna + ssxmod_itna/itna2 + token) required by Alibaba's WAF instead of only a Bearer token, parses phase-based SSE (think→reasoning, answer→content), and refreshes the model catalog to current ids (qwen3.7-max, qwen3.7-plus, `...
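
The combo reasoning token buffer entry above lists three conditions that must all hold before the buffer is applied, plus a feature flag. The sketch below encodes those conditions. The buffer size follows the "+50%, +1000 floor" wording from the 3.8.21 notes, read here as "add half of max_tokens, but at least 1000"; that reading, the field names, and the placeholder default cap are assumptions.

```typescript
// Hypothetical model metadata; field names are illustrative.
interface ModelInfo {
  thinkingCapable: boolean;
  maxOutputTokens?: number; // undefined means the cap is unknown
}

const DEFAULT_OUTPUT_CAP = 4096; // placeholder for "the default cap"

function bufferedMaxTokens(
  requested: number,
  model: ModelInfo,
  bufferEnabled: boolean
): number {
  // Feature flag off, or model not explicitly thinking-capable: no change.
  if (!bufferEnabled || !model.thinkingCapable) return requested;
  // Needs a known, non-default output cap.
  const cap = model.maxOutputTokens;
  if (cap === undefined || cap === DEFAULT_OUTPUT_CAP) return requested;
  // Assumed reading of "+50%, +1000 floor".
  const buffered = requested + Math.max(Math.ceil(requested * 0.5), 1000);
  // Only apply when the full buffered value fits inside the cap.
  return buffered <= cap ? buffered : requested;
}

const roomy: ModelInfo = { thinkingCapable: true, maxOutputTokens: 32768 };
const tight: ModelInfo = { thinkingCapable: true, maxOutputTokens: 5000 };

const applied = bufferedMaxTokens(4096, roomy, true); // 6144
const skippedNoRoom = bufferedMaxTokens(4096, tight, true); // 4096
const skippedFlagOff = bufferedMaxTokens(4096, roomy, false); // 4096
```

Returning the original value when the buffer does not fit, instead of clamping to the cap, matches the entry's "full buffered value fits inside that cap" wording.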


v3.8.22

12 Jun 01:08
@diegosouzapw diegosouzapw
b6c65ef

✨ Added

  • MiMoCode free-tier provider ([#3659] — thanks @pizzav-xyz): new no-auth provider mimocode (alias mcode) exposing Xiaomi's mimo-auto model (1M context) via device-fingerprint bootstrap-JWT auth (/api/free-ai/bootstrap → Bearer JWT → /api/free-ai/openai/chat). Supports multiple accounts (N fingerprints → round-robin with exponential cooldown), re-bootstrap on 401/403, and cooldown on 429. Reuses a new generic NoAuthAccountCard dashboard component (also wired for opencode). 22 unit tests; upstream validated live during review. (Maintainer follow-up: added the required authHeader: "none" field to the registry entry.) Co-authored with @pizzav-xyz.
  • Prefer Claude Code for unprefixed claude-* model IDs ([#3540] — thanks @Witroch4): opt-in setting (default off) that routes bare claude-* model IDs from Claude Code clients through the Claude Code OAuth account instead of requiring a provider prefix. Configurable via the OMNIROUTE_PREFER_CLAUDE_CODE_FOR_UNPREFIXED_CLAUDE_MODELS env flag or a dashboard toggle on the Claude provider page; explicit provider prefixes still win. Full layer coverage (resolver + DB setting + zod schemas + types + UI) with 6 tests. Co-authored with @Witroch4.
  • Codex Responses-WebSocket call history ([#3616] — thanks @kkkayye): Codex /v1/responses WebSocket calls are now persisted to request history — success completions plus prepare-failures, upstream WS errors and premature closes — with sanitizeErrorMessage applied to the stored error. Two proxy-side integration tests cover the success and failure paths.
  • Obsidian/WebDAV: add the /api/v1/webdav file server (PROPFIND/GET/PUT/DELETE/MKCOL/MOVE, Basic-Auth, path-traversal hardened) so Obsidian mobile can sync the vault (#3485, part 2). Implemented in the custom server layer (scripts/dev/webdav-handler.mjs) — intercepted before Next.js to support non-standard HTTP methods (PROPFIND, MKCOL, MOVE, LOCK). Reads vault path and credentials (with enc:v1: AES-256-GCM decryption) directly from the SQLite key_value table; credentials configured via PR1's /api/settings/obsidian/webdav endpoint. 36 TDD unit tests covering traversal guard, constant-time auth, decrypt round-trip, XML generation, and full CRUD cycle.
  • Quota overview: deactivate/activate an account directly from the quota card header (toggle button) so users can park a near-zero-quota account without navigating to the provider detail page. (#3675 — thanks @leninejunior)
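
The MiMoCode entry above mentions round-robin over several accounts with exponential cooldown. A minimal sketch of that selection pattern follows. The data shapes, the 1-second base delay, and the 60-second ceiling are assumptions for illustration and are not taken from the provider code.

```typescript
interface Account {
  id: string;
  failures: number; // consecutive rate-limit hits
  cooldownUntil: number; // epoch ms; 0 means available
}

interface Pool {
  accounts: Account[];
  cursor: number; // index of the next account to try
}

// Pick the next account that is not cooling down, starting at the cursor.
function nextAccount(pool: Pool, now: number): Account | undefined {
  const n = pool.accounts.length;
  for (let i = 0; i < n; i++) {
    const index = (pool.cursor + i) % n;
    const account = pool.accounts[index];
    if (account !== undefined && account.cooldownUntil <= now) {
      pool.cursor = (index + 1) % n;
      return account;
    }
  }
  return undefined; // every account is cooling down
}

// Double the cooldown on each consecutive hit, up to a ceiling.
function markRateLimited(
  account: Account,
  now: number,
  baseMs: number = 1000,
  maxMs: number = 60000
): void {
  account.failures += 1;
  const delay = Math.min(baseMs * Math.pow(2, account.failures - 1), maxMs);
  account.cooldownUntil = now + delay;
}

// A successful call resets the backoff.
function markSuccess(account: Account): void {
  account.failures = 0;
  account.cooldownUntil = 0;
}
```

Passing `now` in explicitly keeps the functions deterministic, which makes the cooldown arithmetic easy to unit-test.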

♻️ Code Quality

  • providers/[id]: extract useProviderConnections, useProviderSettings, useProviderModels hooks from the god-component — #3501 Phase 1f. ProviderDetailPageClient.tsx: 4,948 → 4,063 LOC (−885 lines). New hooks in hooks/: useProviderConnections.ts (954 LOC — all connection management, batch ops, proxy/CLIProxyAPI state, batch-test runner with MAX_BULK_IDS chunking), useProviderSettings.ts (264 LOC — Codex global service mode + Claude routing preference), useProviderModels.ts (155 LOC — model metadata, aliases). Frozen baselines updated. 10 Phase-1f smoke tests; typecheck/cycles/lint green. Co-authored with @oyi77.

  • providers/[id]: extract useModelCompatState hook + model sections (ModelRow, PassthroughModelRow, PassthroughModelsSection, CustomModelsSection, CompatibleModelsSection) from the god-component — #3501 Phase 1e. ProviderDetailPageClient.tsx: 6,838 → 4,922 LOC (−1,916 lines). New leaf hooks/useModelCompatState.ts (101 LOC); compat helpers moved to providerPageHelpers.ts. Frozen baselines: providerPageHelpers.ts: 822. 12 Phase-1e smoke tests; typecheck/cycles/lint green; #3610 auto-hide fix preserved.

  • providers/[id]: extract ConnectionRow (+ CooldownTimer/inferErrorType/getStatusPresentation), ModelCompatPopover (+ recordToHeaderRows), and SiliconFlowEndpointModal from the god-component into components/ (#3501 Phase 1d). ProviderDetailPageClient.tsx: 8,092 → 6,838 LOC (−1,254 lines). Frozen baselines: ConnectionRow.tsx: 941. 7 new Phase-1d smoke tests; typecheck/cycles/lint green.

  • providers/[id]: extract AddApiKeyModal + EditConnectionModal (+ WebSessionCredentialGuide) from the god-component into components/ ([#3501] Phase 1c): extracted the two heaviest inline modals — AddApiKeyModal (~787-LOC body) and EditConnectionModal (~1091-LOC body) — plus shared WebSessionCredentialGuide (~103 LOC) into standalone files under providers/[id]/components/modals/ and providers/[id]/components/ respectively. Added ERROR_TYPE_LABELS and formatTimeAgo to providerPageHelpers.ts (leaf) so EditConnectionModal and ConnectionRow share them without cycles. Pruned 14 now-unused imports from the god-component. ProviderDetailPageClient.tsx: 9,981 → 8,092 LOC (−1,889 lines). Frozen baselines: AddApiKeyModal.tsx: 842, EditConnectionModal.tsx: 1170. 6 new Phase-1c smoke tests; all 21 vitest modal tests pass; typecheck/cycles/lint green.

  • refactor: small db/utils cleanup ([#3523] — thanks @androw): table-driven compression_analytics column migration (replaces 17 repeated ALTER TABLE calls), a single merged serializeJsonField helper in db/providers.ts (folded two byte-identical serializers), and removal of the dead no-op syncProviderDataToCloud/getProvidersNeedingRefresh stubs from shared/utils/machine.ts (no remaining callers). Pure refactor; behavior unchanged.

  • Provider-detail god-component decomposition — Phase 2b (remaining shared helpers→leaf) ([#3501]): extended providers/[id]/providerPageHelpers.ts with all remaining pure helpers needed by the heavy modals (AddApiKeyModal/EditConnectionModal) before they can be extracted. Moved 22 symbols: web-session credential label/hint/check/title helpers; upstream-headers helpers (upstreamHeadersRecordsEqual, headerRowsToRecord, effectiveUpstreamHeadersForProtocol, anyUpstreamHeadersBadge, getProtoSlice) plus their HeaderDraftRow/CompatModelRow/CompatModelMap/CompatByProtocolMap types; Codex consts and helpers (CODEX_REASONING_STRENGTH_OPTIONS, CODEX_ACCOUNT_SERVICE_TIER_VALUES, CODEX_GLOBAL_SERVICE_MODE_VALUES, getCodexServiceTierLabel, normalizeCodexLimitPolicy, getCodexRequestDefaults, getClaudeCodeCompatibleRequestDefaults); misc helpers (compatProtocolLabelKey, extractCommandCodeCredentialInput, normalizeAndValidateHttpBaseUrl, SILICONFLOW_ENDPOINTS, CommandCodeAuthFlowState). New transitive imports wired into the leaf: MODEL_COMPAT_PROTOCOL_KEYS (@/shared/constants/modelCompat), CodexServiceTier/getCodexRequestDefaults/getClaudeCodeCompatibleRequestDefaults (@/lib/providers/requestDefaults), CodexGlobalServiceMode (@/lib/providers/codexFastTier), WebSessionCredentialRequirement (./webSessionCredentials). ProviderDetailPageClient.tsx: 10,288 → 9,980 LOC. Leaf module: 589 LOC (acyclic). 25-assertion unit test suite passes; smoke test 3/3; no import cycles. Co-authored with @oyi77.

  • Provider-detail god-component decomposition — Phase 2 (helpers→lib) ([#3501]): extracted the pure shared helpers — ProviderMessageTranslator/LocalProviderMetadata types, providerText/providerCountText/readBooleanToggle, and the provider base-URL + routing-tag/excluded-model parse/format block — into a new leaf providers/[id]/providerPageHelpers.ts (imports only @/shared, so the client and modals share them with no import cycle). ProviderDetailPageClient.tsx: 10,435 → 10,288 LOC. Unblocks extracting the heavier AddApiKeyModal/EditConnectionModal (which depend on these helpers) without cycling. The Phase 0 smoke test caught a missing transitive import (isSelfHostedChatProvider) at mount — now wired + locked by a new helpers unit test (12 assertions). Co-authored with @oyi77.

  • #3500 fully resolved — Hard Rule #5 (no raw SQL in route handlers): all 13 internal offenders migrated to src/lib/db/ modules across slices (calllogs, usage_history/daily_usage_summary, community_servers, usage_logs, semantic_cache, proxy_logs, skills UPDATE, db-backups). The gate's KNOWN_RAW_SQL set is renamed to EXTERNAL_DB_ALLOWED (with a back-compat alias) and now holds only the 2 external-DB reads (oauth/cursor/auto-import, oauth/kiro/auto-import) — these open another app's SQLite to import credentials, so by design they cannot live in OmniRoute's db/ domain. The gate still blocks any NEW raw SQL against OmniRoute's DB.

  • chore(db-gate): reclassify KNOWN_UNEXPORTED → INTENTIONALLY_INTERNAL in scripts/check/check-db-rules.mjs ([#3499]): a full audit of all 25 db modules confirmed each is consumed via direct/dynamic import per Hard Rule #2 ("Never barrel-import from localDb.ts"). The old framing labelled them as "debt", which was misleading — they are the correct pattern. The gate's blocking behaviour is unchanged (a NEW unexported module still fails); only the name, comments, and per-module justifications were updated to reflect audited truth. Four modules flagged DEAD? (compressionScheduler, discovery, pluginMetrics, prompts) have zero production importers and are documented as schema-reserved. A new regression-guard test (tests/unit/check-db-rules-classification.test.ts) asserts every non-dead module in the set has ≥1 real importer, so a future consumer removal surfaces as a test failure requiring explicit reclassification.

  • refactor(db): move call_logs aggregations into callLogStats db module ([#3500]): extracted raw SQL from three route handlers (/api/provider-metrics, /api/search/stats, /api/v1/search/analytics) into a new src/lib/db/callLogStats.ts domain module (getProviderMetrics, getSearchProviderStats, getRecentSearchLogs, getSearchAggregateStats, getSearchProviderCounts). First slice of #3500 (call_logs cluster). Behavior unchanged; the three routes are removed from KNOWN_RAW_SQL in th...
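
The db/utils cleanup entry above replaces repeated ALTER TABLE calls with a table-driven migration. The general shape of that refactor is sketched below: the columns live in a data table, and one loop emits a statement for each column that is still missing. The column names and definitions are invented for the example, and the sketch only builds SQL strings; it does not touch a database.

```typescript
interface ColumnSpec {
  name: string;
  definition: string; // SQL type plus default
}

// Hypothetical column table; the real migration covers 17 columns.
const ANALYTICS_COLUMNS: ColumnSpec[] = [
  { name: "tokens_saved", definition: "INTEGER DEFAULT 0" },
  { name: "ratio", definition: "REAL DEFAULT 0" },
  { name: "strategy", definition: "TEXT" },
];

// Emit one ALTER TABLE per column that the table does not have yet.
function buildAddColumnStatements(
  table: string,
  existingColumns: Set<string>,
  specs: ColumnSpec[]
): string[] {
  return specs
    .filter((spec) => !existingColumns.has(spec.name))
    .map(
      (spec) =>
        `ALTER TABLE ${table} ADD COLUMN ${spec.name} ${spec.definition}`
    );
}

// A table that already has "tokens_saved" only needs the other two columns.
const pendingStatements = buildAddColumnStatements(
  "compression_analytics",
  new Set(["id", "tokens_saved"]),
  ANALYTICS_COLUMNS
);
```

Adding a column then becomes a one-line change to the table, and rerunning the migration is a no-op once every column exists.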


v3.8.21

11 Jun 08:02
@diegosouzapw diegosouzapw


✨ Added

  • feat(cli): omniroute autostart now accepts the shorthand the headless / omniroute serve path was missing — omniroute autostart on / ... true (aliases of enable), ... off / ... false (aliases of disable), a new ... toggle, and a default ... status (bare omniroute autostart is a safe read-only). Previously autostart could only be toggled from the tray (serve --tray) or the Electron Appearance tab, so a plain omniroute serve user had no way to enable it. (The cross-platform launchd/systemd/registry logic is unchanged — this only wires the ergonomic CLI surface.) (#3331 — thanks @uniQta)
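
The alias table in the feat(cli) entry above (on/true for enable, off/false for disable, toggle, and a default of status) can be written as a small argument parser. This is a sketch of the mapping only. The function name is hypothetical, and treating input case-insensitively and returning undefined for unknown words are assumptions.

```typescript
type AutostartAction = "enable" | "disable" | "toggle" | "status";

// Map the word after `omniroute autostart` to an action.
function parseAutostartArg(arg: string | undefined): AutostartAction | undefined {
  if (arg === undefined) return "status"; // bare command is read-only
  switch (arg.toLowerCase()) {
    case "on":
    case "true":
    case "enable":
      return "enable";
    case "off":
    case "false":
    case "disable":
      return "disable";
    case "toggle":
      return "toggle";
    case "status":
      return "status";
    default:
      return undefined; // unknown word: let the caller print usage
  }
}
```

Defaulting the bare command to `status` keeps an accidental invocation from changing system state.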

♻️ Code Quality

  • refactor(chatCore): extract the chatCore request phases — idempotency check, semantic cache check, common request sanitization, and memory/skills injection — into dedicated open-sse/handlers/chatCore/ modules (idempotency.ts, semanticCache.ts, sanitization.ts, memorySkillsInjection.ts), slimming the monolithic handler with no behavior change. (Maintainer follow-up: re-derive idempotencyKey at the Phase 9.2 save site after the check moved into the module, fixing a ReferenceError on successful non-cached responses.) (#3598 — thanks @oyi77)
  • docs(opencode-provider): soft-deprecate @omniroute/opencode-provider in favour of @omniroute/opencode-plugin. The provider package writes a static model list to opencode.json that drifts behind the live OmniRoute catalog, whereas the plugin fetches /v1/models at OpenCode startup. The package keeps working (no code/behavior change), but its npm description and README now carry a deprecation banner with the one-line migration, and a guard test pins the notice. (#3419 — thanks @herjarsa)
  • chore(review): pre-release hardening from a multi-reviewer /review-reviews battery over the v3.8.21 diff (7 Opus reviewers; zero blocker/high). Resolved findings: npm tarball no longer ships co-located test files (files[] negations + reconciled .npmignore; the #3578 closure gate now asserts the real npm pack output in both directions); getSanitizedCachedProviderLimitsMap scopes its connection scan to antigravity/agy instead of decrypting every active connection on each dashboard poll; the Antigravity quota-tier remap (toClientAntigravityQuotaModelId) is centralized in antigravityModelAliases.ts (was an inline if-ladder in usage.ts); the chatCore idempotency check returns its resolved key so the save site reuses a single derivation; and new tests pin the chatCore extracted modules, the Antigravity usage_history fallback contract, the reasoning-wrapper prefix-preservation heuristic, the Antigravity SSE markdown branch, and the upstream-ca/test no-persist guarantee. (Live-verified that agy consumer tokens are accepted by the non-daily cloudcode-pa host used by retrieveUserQuota, so #3604 is not agy-host-limited.)

🔧 Bug Fixes

  • fix(routing): reasoning models (deepseek-v4-flash, nemotron, etc.) no longer return empty content in combo routing when they spend all of max_tokens on reasoning — validateResponseQuality now rejects an empty-content-but-reasoning_content response when reasoning consumed ≥90% of completion tokens (so the combo loop retries/falls back), and reasoning models receive a max_tokens buffer (+50%, +1000 floor) so reasoning and content both fit. (Maintainer follow-up: the round-robin buffer is applied to a per-attempt copy so it does not compound across models/retries — 4096 → 6144 → 9216 → ....) (#3588 — thanks @herjarsa)
  • fix(routing): a valid max_tokens-truncated upstream response is no longer misclassified as empty content and rewritten into a fake 502 — isEmptyContentResponse() flagged any Claude content:[] / OpenAI empty-choice payload regardless of stop_reason/finish_reason, so a Claude Code max_tokens: 1 connectivity ping (HTTP 200, stop_reason:"max_tokens", empty content) became a synthetic 502 "Provider returned empty content" and triggered a needless family fallback. The guard now treats a terminal truncation/tool signal (Claude stop_reason max_tokens/tool_use, OpenAI finish_reason length/tool_calls) as a legitimate completion; genuinely empty responses (no terminal reason, or stop/end_turn with empty content) are still caught. (#3572)
  • fix(api): /v1/completions now returns the legacy OpenAI Completions shape (object:"text_completion", choices[].text) instead of chat payloads (choices[].message|delta.content) — the endpoint routes internally through the chat pipeline, so legacy Completion clients like TabbyML's openai/completion backend crashed with missing field "text". The response (both non-streaming JSON and the SSE stream) is now translated back to the text-completion shape; [DONE] and error bodies pass through unchanged. (#3571)
  • fix(usage): the z.ai/GLM coding-plan quota card no longer shows "Monthly 0%" — coding plans have no monthly cap (only 5-hour windows), so the quota API reports the TIME_LIMIT ("Monthly") entry with total=0, and the total>0 ? ... : 0 fallback rendered a misleading 0% remaining (which can skew downstream model-choice). With no absolute cap the remaining percentage now falls back to the percentage-derived value (full/100% when 0% used). (#3580)
  • docs(discovery): mark DISCOVERY_TOOL_DESIGN.md's API Endpoints table with an explicit "⚠️ Not yet implemented — Phase 2" banner — the discovery routes are a design proposal (Phase-1 stub only), and the banner makes clear the KNOWN_STALE_DOC_REFS gate suppression is intentional, not stale drift. (#3498)
  • fix(agent-bridge): add the missing POST /api/tools/agent-bridge/upstream-ca/test route — the UpstreamCaField "Test" button POSTed to it but it didn't exist (404). The new validate-only route checks the CA file exists and is a parseable PEM certificate (returns the subject/expiry) without persisting the path or activating it; it inherits the /api/tools/agent-bridge/ LOCAL_ONLY classification. (#3488)
  • fix(gamification): the dashboard Profile page no longer hits three 404s — added the missing GET /api/gamification/{level,badges,badges/earned} routes (management-scoped). The page is operator-wide (no apiKeyId), so level/badges/earned aggregate across all keys (with an optional ?apiKeyId for a single key), and badges seeds the built-in catalog first (idempotent) so the grid is populated even on installs that never seeded it (see #3472). (#3484)
  • security(oauth): migrate the five public OAuth client_ids (Claude, Codex, Qwen, Kimi, GitHub Copilot — 9 server-side call-sites in providerRegistry.ts + oauth.ts) from string literals to resolvePublicCred() (Hard Rule #11), matching the existing Gemini/Antigravity pattern. The values decode byte-for-byte to the same public client_ids (env overrides still win), so OAuth flows are unchanged; the check-public-creds allowlist is now empty. The browser-bundled codexDeviceFlow.ts copy stays a literal by necessity (it cannot import open-sse). (#3493)
  • fix(mcp): omniroute --mcp no longer crashes on npm installs with ERR_MODULE_NOT_FOUND (e.g. src/lib/combos/steps.ts) — the MCP server runs from raw TypeScript and imports across src/ + open-sse/, but the published files allowlist only shipped a handful of cherry-picked paths, so the transitive closure (~400 files) was absent from the tarball. files now ships the backend source the MCP server needs (open-sse/ + src/{domain,lib,mitm,server,shared,sse,types}/, excluding the src/app UI), and a new regression test computes the MCP import closure and fails if any reachable source file is not covered by files. (#3578)
  • fix(api): API_REFERENCE.md no longer documents a non-existent /api/guardrails* / /api/shadow* surface (doc-fiction flagged by check-docs-symbols, frozen in KNOWN_STALE_DOC_REFS). The guardrail pipeline is real (src/lib/guardrails), so the two routes that map to actual behavior are now implemented — GET /api/guardrails (list the registered guardrails + status) and POST /api/guardrails/test (dry-run the pre-call pipeline over a sample input), both management-scoped — while the fictional enable/disable/logs rows and the entire /api/shadow* table (shadow A-B comparison is combo-config + /api/combos/metrics) were removed from the doc and dropped from the allowlist. (#3496)
  • fix(agent-bridge): the MITM "Start" button no longer reports a misleading "port 443 may be in use" for every failure cause — startMitm() only matched the EADDRINUSE stderr line and always threw the port-443 message, so a missing ROUTER_API_KEY or an EACCES permission error sent users debugging the wrong thing. The startup watcher now buffers the MITM child's stderr and interpretMitmStartupError() maps the real server.cjs cause (port-in-use / permission-denied / missing API key / any other diagnostic line) into the surfaced error; with no captured output it stays generic instead of guessing port 443. (#3606)
  • fix(oauth): Kiro "Import Token" no longer reports a bare Internal server error that hides the real cause — the import validates/refresh...
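
The per-attempt `max_tokens` buffer from #3588 can be sketched as follows. This is a minimal illustration, not the project's code: it assumes "+50%, +1000 floor" means "add half of `max_tokens`, but at least 1000", and the `ChatRequest` shape, `withReasoningBuffer`, and `buildAttempt` are hypothetical names.

```typescript
// Hypothetical request shape; only max_tokens matters for this sketch.
interface ChatRequest {
  model: string;
  max_tokens: number;
}

// Assumed reading of "+50%, +1000 floor": add half of max_tokens, never less than 1000.
function withReasoningBuffer(maxTokens: number): number {
  const extra = Math.max(Math.ceil(maxTokens * 0.5), 1000);
  return maxTokens + extra;
}

// Build a fresh copy for each attempt so the buffer is always computed
// from the caller's original value and never compounds across retries.
function buildAttempt(base: ChatRequest, targetModel: string, isReasoningModel: boolean): ChatRequest {
  return {
    ...base,
    model: targetModel,
    max_tokens: isReasoningModel ? withReasoningBuffer(base.max_tokens) : base.max_tokens,
  };
}

const base: ChatRequest = { model: "combo/example", max_tokens: 4096 };
const first = buildAttempt(base, "deepseek-v4-flash", true);
const second = buildAttempt(base, "nemotron", true);
console.log(first.max_tokens, second.max_tokens, base.max_tokens); // 6144 6144 4096
```

Because every attempt starts from `base`, the second target sees 6144 rather than 9216, which is the compounding the maintainer follow-up describes avoiding.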

Contributors

dhaern, diegosouzapw, and 3 other contributors

v3.8.20

10 Jun 19:28
@diegosouzapw diegosouzapw
d6f008c

✨ New Features

  • feat(providers): add Claude Fable 5 (claude-fable-5) — wires the new flagship model across the full pipeline: cc and kiro provider registries (1M context, 128k output), pricing at $15/$75 per 1M tokens, model spec (adaptive thinking, vision, tool use), fast mode, 1M-context beta header, fallback chain (claude-fable-5 → claude-opus-4-8 → claude-opus-4-7 → claude-sonnet-4-6), and cost data. (#3524 — thanks @ggiak)
  • feat(resilience): add global provider cooldown tracking to prevent combo re-walking — after a provider fails in a combo request, subsequent requests skip it for a configurable exponential backoff (default 5s min, 5min max, doubling per failure), reducing wasted time on known-failing providers. Configurable and opt-out via Settings → Resilience. (#3556 — thanks @pizzav-xyz)
  • feat(resilience): expose provider breaker degradation threshold setting — the consecutive-failure count before a provider enters the DEGRADED state is now configurable in Settings → Resilience alongside the existing open/half-open thresholds. (#3535 — thanks @rdself)
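
The cooldown schedule described in #3556 (5s minimum, 5min maximum, doubling per failure) can be sketched like this. The defaults come from the note above; the `ProviderCooldowns` class, its method names, and the in-memory `Map` are assumptions made for illustration only.

```typescript
// Defaults from the release note: 5s minimum, 5min maximum, doubling per failure.
const MIN_COOLDOWN_MS = 5_000;
const MAX_COOLDOWN_MS = 5 * 60_000;

function cooldownMs(consecutiveFailures: number): number {
  if (consecutiveFailures <= 0) return 0;
  const raw = MIN_COOLDOWN_MS * Math.pow(2, consecutiveFailures - 1);
  return Math.min(raw, MAX_COOLDOWN_MS);
}

// Hypothetical in-memory tracker keyed by provider id.
class ProviderCooldowns {
  private state = new Map<string, { failures: number; until: number }>();

  // Record a failure and extend the cooldown window from "now".
  recordFailure(provider: string, now: number): void {
    const previous = this.state.get(provider);
    const failures = (previous ? previous.failures : 0) + 1;
    this.state.set(provider, { failures, until: now + cooldownMs(failures) });
  }

  // A success clears the provider's failure history.
  recordSuccess(provider: string): void {
    this.state.delete(provider);
  }

  // Combo routing would consult this before trying a provider.
  shouldSkip(provider: string, now: number): boolean {
    const entry = this.state.get(provider);
    return entry !== undefined && now < entry.until;
  }
}
```

Under these assumptions the schedule is 5s, 10s, 20s, and so on, reaching the 5-minute cap at the seventh consecutive failure.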

🔧 Bug Fixes

  • fix(translator): scope the Gemini thoughtSignature bypass to the Antigravity/CLI path and unwrap array-shaped Gemini error bodies — signature-less historical tool calls on Antigravity/CLI are emitted as native parts carrying the skip_thought_signature_validator sentinel (preventing upstream 400s), while the standard Gemini direct path keeps its existing text/context representation untouched. (#3560 — thanks @oyi77 and @Six7Day via #3414)

  • fix(routing): combo model substitution no longer forwards a client thinking:{type:"disabled"} to a target model that rejects it — when a combo/route swaps the upstream model (e.g. claude-opus-4-8 → claude-fable-5), OmniRoute now strips the now-invalid thinking.type:"disabled" for models flagged rejectsThinkingDisabled (Fable 5 defaults to adaptive and rejects it), preventing the upstream 400 that silently broke Claude Code's internal title/name-generation calls. Models that accept disabled (opus/sonnet) are untouched. (#3554)

  • fix(usage): the budget dashboard can now save a budget with some limit fields left empty and clear all limits — setBudgetSchema used .positive() (rejecting the 0 the form sends for blank fields) plus a superRefine requiring at least one limit > 0, so saving with one field filled 400'd and clearing all limits was impossible. Limits now accept 0 (= "no limit for this period"; enforcement only kicks in above 0) and the cross-field minimum was removed; negatives are still rejected. (#3537)

  • fix(gamification): badge-unlock events no longer re-fire on every request — the "already unlocked?" guard used getBadges(), which INNER-JOINs badge_definitions (empty until seeded), so it always reported "not earned" and re-emitted events.badge_unlocked per request. Added a hasBadge() helper that reads user_badges directly, so dedup is correct regardless of whether definitions are seeded. (#3472)

  • fix(routing): the auto model keyword now works on the Codex /v1/responses path — resolveResponsesApiModel rewrote the bare auto keyword to codex/auto, which ChatGPT rejects (The 'auto' model is not supported when using Codex with a ChatGPT account). auto (OmniRoute's zero-config auto-routing keyword) now passes through untouched so combo routing handles it. (#3509)

  • fix(cli-tools): saving the OpenCode/CLI tool config no longer 400s in cloud mode — every CLI tool card posts apiKey: null (the real key is resolved server-side from keyId), but guideSettingsSaveSchema used z.string().optional(), which rejects null. The schema now normalizes null → undefined, so the save succeeds and the keyId/default path is used. (#3552)

  • fix(catalog): PublicAI is no longer miscatalogued as keyless/free — it requires an API key (registry authType:"apikey"; signup grants a one-time credit, then it bills). The three PublicAI models moved from freeType:"keyless" (which could pick them into the no-auth pool and dispatch with no Authorization header) to "one-time-initial", and the provider's hasFree flag is now false — matching freeTierCatalog.ts, which already excluded publicai. (#3558)

  • fix(gemini-web): a missing Playwright Chromium browser no longer loops and trips the provider breaker — when the browser binary is not installed, chromium.launch() threw an error surfaced as a retryable 500, so accountFallback marked the account unavailable and retry-looped. It is now classified as a host/config problem and returns 503 with an actionable message (npx playwright install chromium) and the X-Omni-Fallback-Hint: connection_cooldown header, which skips the provider circuit breaker and applies a short non-exponential cooldown. (#3516)

  • fix(proxy): the SOCKS5 proxy option now follows the runtime ENABLE_SOCKS5_PROXY env instead of the build-time NEXT_PUBLIC_ENABLE_SOCKS5_PROXY — Next.js inlines NEXT_PUBLIC_* at build time, so a prebuilt Docker image ignored a runtime setting and the SOCKS5 type stayed hidden. The proxy modal now reads socks5Enabled from GET /api/settings/proxies (server-side ENABLE_SOCKS5_PROXY), with the build-time value kept only as a static-deploy fallback. (#3508)

  • fix(playground): the playground model selector now lists models from custom-endpoint (OpenAI/Anthropic-compatible) providers — it filtered /v1/models by the provider's connection id, but the catalog emits compatible-provider models under the node's custom prefix (prefix/model), so the list came up empty ("None"/"-"). The selector now filters by the node prefix (exposed additively as modelPrefix on provider options; the connection id is unchanged, so translator send/translate and connection lookups are unaffected). (#3505)

  • fix(usage): the Kiro quota card no longer renders a blank when the account returns no usage breakdown — getKiroUsage returned quotas:{} for a successful GetUsageLimits response without a usageBreakdownList (observed with some AWS IAM / Builder ID accounts), which the dashboard showed as an unexplained empty card. It now returns an informative message (surfaced via the card's connection-message path). (#3506)

  • fix(security): route raw err.message through sanitizeErrorMessage() in five web executors (adapta-web, deepseek-web, perplexity-web, qoder, veoaifree-web) and the embeddings + search handlers (Hard Rule #12) — these built error response bodies from the raw upstream/exception message, which could leak internal detail. (#3494, #3495)

  • fix(dashboard): correct two dashboard fetches that hit non-existent routes (404) — CustomHostsManager called /api/tools/traffic-inspector/custom-hosts (the real route is /hosts), and FeatureFlagsGrid's post-restart liveness probe called /api/health (the real lightweight endpoint is /api/health/ping). (#3486, #3487)

  • chore(providers): remove the dead krutrim registry entry — it was half-registered (present in providerRegistry.ts with a baseUrl + one model, but absent from providers.ts, with no executor/translator/OAuth), so it was never selectable. Dropped its ProviderIcon entry and the KNOWN_REGISTRY_ONLY exception. (#3483)

  • docs(api): fix the agent-bridge per-agent state route in openapi.yaml and AGENTBRIDGE.md — both documented /api/tools/agent-bridge/agents/{id}/state, which has no route; corrected to the real per-agent /api/tools/agent-bridge/agents/{id} (global state remains /api/tools/agent-bridge/state). (#3489)

  • docs(api): correct API_REFERENCE.md endpoints that documented non-existent routes — skills (PUT /api/skills/[id], POST/GET /api/skills/executions), plugins ([id] → [name], activate/deactivate), ACP (DELETE/POST /api/acp/agents via ?id/{action:"refresh"}), cache (DELETE /api/cache/reasoning, /api/cache/entries), and removed the fabricated /api/admin/circuit-breaker, /api/admin/rate-limits, and /api/system-info (admin only exposes /concurrency). (#3497)

  • fix(executor): strip provider prefix from versioned built-in tool model field — Anthropic rejects tools[N].model: "cc/claude-opus-4-8" from Claude Code's advisor_20260301 and similar versioned built-in tools; the native Claude OAuth execute path now strips any provider prefix from model on tools whose name matches name_YYYYMMDD. (#3532 — thanks @ggiak)

  • fix(dashboard): handle DEGRADED and unknown provider breaker states on the Runtime page — an unrecognised breaker state (e.g. DEGRADED) caused a crash because the styling map had no entry ...
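
The thinking-strip behavior from #3554 can be illustrated with a small sketch. Only the `rejectsThinkingDisabled` flag name and the model ids come from the notes above; the `MODEL_SPECS` table, the `RequestBody` shape, and `substituteModel` are hypothetical and do not reflect the project's actual modules.

```typescript
interface ThinkingConfig {
  type: string;
  budget_tokens?: number;
}

interface RequestBody {
  model: string;
  thinking?: ThinkingConfig;
}

// Hypothetical spec table; only the flag name is taken from the release note.
const MODEL_SPECS: Record<string, { rejectsThinkingDisabled?: boolean } | undefined> = {
  "claude-fable-5": { rejectsThinkingDisabled: true },
  "claude-opus-4-8": {},
};

// Swap the upstream model and drop thinking:{type:"disabled"} when the
// target model is flagged as rejecting it. The input body is not mutated.
function substituteModel(body: RequestBody, targetModel: string): RequestBody {
  const next: RequestBody = { ...body, model: targetModel };
  const spec = MODEL_SPECS[targetModel];
  const rejects = spec !== undefined && spec.rejectsThinkingDisabled === true;
  if (rejects && next.thinking !== undefined && next.thinking.type === "disabled") {
    delete next.thinking;
  }
  return next;
}

const original: RequestBody = { model: "claude-opus-4-8", thinking: { type: "disabled" } };
const swapped = substituteModel(original, "claude-fable-5");
console.log(swapped.thinking); // undefined
```

Models without the flag keep whatever `thinking` value the client sent, which matches the "opus/sonnet are untouched" statement.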

v3.8.19

10 Jun 05:52
@diegosouzapw diegosouzapw
6306800

Focused quality-infrastructure release: the complete quality-gate ratchet + anti-hallucination guardrail system (Phases 0–6 + fast-tracked 6A.1/6A.2). No external PRs were taken this cycle by design — community PRs carry over to the next cycle.

✨ New Features

  • feat(quality): quality-gate ratchet + anti-hallucination/rule-enforcement guardrails (Phases 0–6) — generic multi-metric ratchet engine (quality-baseline.json + collector + comparator, regression-only) and ~18 deterministic gates wired into CI: provider-consistency, dashboard fetch()→route and OpenAPI/docs→route resolution (anti-hallucination), dependency allowlist (anti-slopsquatting), file-size/duplication/complexity ratchets (frozen debt only shrinks), anti test-masking (assert-removal/tautology detection on PR diffs), error-helper (Hard Rule #12), public-creds (Rule #11), route-guard membership (Rules #15/#17), db-rules (Rules #2/#5), known-symbols (executors/strategies/translators), migration numbering. Re-enabled the cheap pre-commit hook, tiered npm audit, reconciled the CI coverage gate (40→60) and wired 3 orphaned contract gates. (#3471 — thanks @diegosouzapw)
  • feat(quality): test-discovery gate + 135 orphan tests re-wired + vitest in CI (fast-tracked Phase 6A.1/6A.2) — new check:test-discovery proves every *.test.ts|tsx is collected by a runner that actually executes (15 collectors with textual drift-check; orphans frozen in a shrink-only baseline). Found 195 orphan test files (incl. authz/routeGuard.test.ts guarding Rules #15/#17 — already rotten); 135 re-wired into the node runner via explicit-braces recursive globs across all scripts + 4 CI call sites; the remaining 60 are categorized debt. New test-vitest CI job: test:vitest blocking (146/146), test:vitest:ui informational (14 pre-existing UI-drift fails, triage 2026-06-16). (#3536 — thanks @diegosouzapw)
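
A regression-only ratchet of the kind described in #3471 might compare a stored baseline against freshly collected metrics roughly as below. This is a sketch under assumptions: the rule format, the `compareToBaseline` name, and the `coverage` metric are invented for illustration; only `eslintWarnings` appears in these release notes.

```typescript
type Direction = "lower-is-better" | "higher-is-better";

interface MetricRule {
  direction: Direction;
}

interface Regression {
  metric: string;
  baseline: number;
  current: number;
}

// Report only metrics that moved in the wrong direction.
// Improvements and unchanged values pass silently (regression-only).
function compareToBaseline(
  rules: Record<string, MetricRule>,
  baseline: Record<string, number | undefined>,
  current: Record<string, number | undefined>,
): Regression[] {
  const regressions: Regression[] = [];
  for (const metric of Object.keys(rules)) {
    const before = baseline[metric];
    const after = current[metric];
    // A metric missing on either side is not judged in this sketch.
    if (before === undefined || after === undefined) continue;
    const worse = rules[metric].direction === "lower-is-better" ? after > before : after < before;
    if (worse) regressions.push({ metric, baseline: before, current: after });
  }
  return regressions;
}

const rules: Record<string, MetricRule> = {
  eslintWarnings: { direction: "lower-is-better" },
  coverage: { direction: "higher-is-better" }, // hypothetical metric name
};
```

A CI step would fail when the returned list is non-empty, and the baseline would be rewritten only when a metric improves, so frozen debt can shrink but never grow.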

🔧 Bug Fixes

  • fix(authz): restored the missing BYPASS_PREFIX_NOT_ALLOWED schema guard (Hard Rules #15/#17) — the zod refine documented as layer-1 in routeGuard.ts was absent from the live settingsSchemas.ts, so PATCH /api/settings accepted spawn-capable prefixes (e.g. /api/cli-tools/runtime/) into the manage-scope bypass list (the layer-2 runtime predicate still refused to honour them). Surfaced by re-wired orphan tests AC-8/AC-10c, which now stand as the permanent regression guard. (#3536 — thanks @diegosouzapw)
  • fix(db): closeDbInstance()/resetDbInstance() now fire the stateReset.ts module-state resetters (previously only backup-restore did) — apiKeys.ts kept a process-level schema memo across a recreated DB, so the stale re-prepare exploded with no such column: is_active and clients received 503 instead of 403 for an invalid bearer; the same path hit production when restoring an older backup snapshot. Includes a dedicated regression test; a test that had accommodated the buggy 503 now asserts the deterministic 403. (#3536 — thanks @diegosouzapw)

🔒 Security

  • fix(security): block the cloud-metadata SSRF pivot in the cli-tools catalog fetch (CodeQL js/request-forgery, critical) — fetchOmniRouteCatalog() built its /v1/models URL from a user-controlled baseUrl and fetched it. Since the legitimate target is the user's own OmniRoute (loopback), the public-only guard can't apply; assertSafeCatalogUrl() now blocks the cloud-metadata/link-local pivot (169.254.169.254, metadata.google.internal, ...) unconditionally, plus non-http(s) protocols and embedded credentials, and the request fetches the re-parsed (taint-severed) URL. Loopback and public OmniRoute Cloud targets stay allowed. (#3544 — thanks @diegosouzapw)
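
The checks listed for #3544 can be sketched with the standard `URL` class. This is not the project's `assertSafeCatalogUrl()`; the function name `assertSafeCatalogUrlSketch`, the blocklist contents beyond the two hosts named above, and the error messages are assumptions. It also covers only IPv4 link-local addresses, so an IPv6 link-local check would need to be added separately.

```typescript
// Hosts named in the release note; a real list would likely be longer.
const BLOCKED_HOSTS = new Set<string>(["metadata.google.internal"]);

// 169.254.0.0/16 is the IPv4 link-local range used by cloud metadata services.
function isLinkLocalIPv4(host: string): boolean {
  return /^169\.254\.\d{1,3}\.\d{1,3}$/.test(host);
}

function assertSafeCatalogUrlSketch(raw: string): URL {
  const url = new URL(raw); // throws on malformed input
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error("unsupported protocol");
  }
  if (url.username !== "" || url.password !== "") {
    throw new Error("embedded credentials are not allowed");
  }
  const host = url.hostname.toLowerCase();
  if (BLOCKED_HOSTS.has(host) || isLinkLocalIPv4(host)) {
    throw new Error("cloud-metadata or link-local target blocked");
  }
  // Return a re-parsed URL so the caller fetches the validated value,
  // not the original user-supplied string.
  return new URL(url.toString());
}
```

Loopback targets such as `http://127.0.0.1:3000/v1/models` pass, which reflects the note that the legitimate target is the user's own instance and a public-only guard cannot apply.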

📝 Maintenance

  • docs(quality): Phase 6A critical-audit plan + Phase 7 community-tooling additions, both stored with an activation gate of 2026-06-16 — 6A: stale-allowlist enforcement, ratchet --require-tighten, gate scope expansions, remaining orphan/UI-suite triage; Phase 7 additions: gitleaks (Betterleaks noted), actionlint + zizmor, SPDX license compliance. (#3530 — thanks @diegosouzapw)
  • chore(quality): conscious, documented re-baselines so the quality-gate debuts holding the REAL published line — file-size frozen at current sizes for 9 files that grew in the v3.8.18 era (RequestLoggerV2 +281, stream +101, combo +73, chatCore +45, ...) and eslintWarnings 3482→3501 (the published v3.8.18 tag already measured 3501; this cycle is neutral). Driving both down is Phase 6A work. (#3538 — thanks @diegosouzapw)
  • chore(release): open the v3.8.19 development cycle (version bump + electron lockfile sync) and ignore generated yt-downloader artifacts. (thanks @diegosouzapw)
  • test: release-gate stabilization — the re-wired suites + the debuting CI gates surfaced and fixed 6 latent test defects: 2 suites depended on the dev machine's configured password (now hermetic), the breaker reset-timeout test ran on a 5ms margin, the bypass-prefix schema test consecrated the pre-#3536 bug, the chatcore upstream-timeout test had a structurally-broken pending-detail predicate (tested .providerRequest on an array — never passed isolated, even at the published v3.8.18 tag), and internal planning docs were excluded from the docs-symbols gate. Coverage floors re-baselined to the honest post-re-wire denominator (78.4% measured: previously-never-imported modules now count). (thanks @diegosouzapw)

What's Changed

Full Changelog: v3.8.18...v3.8.19

Contributors

diegosouzapw and rafacpti23

v3.8.18

09 Jun 19:24
@diegosouzapw diegosouzapw
8169b97

✨ New Features

  • feat(ui): unified Active + Finished requests into a single view — the dashboard now shows in-flight and completed requests in one list with deep-linking, live streaming detail, and a dedicated /api/logs/[id] detail route; pending requests are tracked per connection and finalized as they complete. (#3401 — thanks @hartmark / @diegosouzapw)
  • feat(plugins): plugin lifecycle hooks + theme-manager example — adds onInstall/onActivate/onDeactivate/onUninstall lifecycle events dispatched by the plugin manager, thins index.ts to a backward-compatible re-export shim over hooks.ts, and ships theme-manager + request-logger example plugins. (#3473 — thanks @oyi77 / @diegosouzapw)
  • feat(browserPool): Playwright proxy resolved from the proxy registry — browser-backed providers (claude-web/gemini-web) now route through the configured per-provider/global proxy instead of connecting directly, matching how OAuth/token-refresh already honor resolveProxyForProvider (closes the VPS IP-rate-limit gap for the browser path). Fully additive with graceful degradation. (#3492 — thanks @borodulin)

🔧 Bug Fixes

  • fix(executor): Llama / OpenAI-compat base URL normalization — a baseURL without a path (e.g. llama.example.foo) or with a non-/v1 path (e.g. bar.example.com/foo) now correctly gets /v1/chat/completions appended, fixing the 404 on message sends while GET /model still worked. (#3519 — thanks @hartmark)
  • fix(sse): empty-choices chunks without usage are dropped instead of injecting retry text — a streamed chunk carrying an empty choices array and no usage is now silently skipped rather than emitting placeholder retry text into the stream, eliminating spurious content for clients that send such keepalive-style frames. (#3513 — thanks @diegosouzapw)
  • fix(types): restored a clean typecheck:core — typed getPendingRequests() to its real shape (Record<string, Record<string, number>>) so the unified-requests view (#3401) no longer treats pending counts as unknown, cast the streamChunks log payload to its declared type, and aligned preScreenTargets (#3169) to the canonical IsModelAvailable signature (sync-or-async, normalized via Promise.resolve). (thanks @diegosouzapw)
  • fix(opencode-plugin): repaired the corrupted index.ts that broke the npm publish-opencode-plugin build (introduced by the #3435 branch) — removed two duplicated code blocks (apiFormat + debug-logging), dropped the local normaliseFreeLabel superseded by the naming.ts extraction, fixed an undefined sdkBaseURL reference, declared the missing startupDebug / logLevel feature-schema fields, and fixed shortProviderLabel dropping the prefix on a long displayName with no alias. Plugin now builds (DTS clean) with all 254 tests green. (#3435 — thanks @diegosouzapw)
  • fix(catalog): Codex CLI model-catalog refresh no longer errors — GET /v1/models now returns a top-level models: [] array for Codex clients (detected via the originator / user-agent = codex_* headers it sends on GET /v1/models?client_version=...), so codex_models_manager stops failing to decode the OpenAI-standard response and no longer logs failed to refresh available models on every startup. The array is intentionally empty: Codex replaces its built-in per-model agent prompt (base_instructions, ~21k chars) with whatever a populated entry carries for the selected model, so emitting our catalog would break Codex's agent behaviour — an empty list keeps Codex on its built-in model info (same inference as before, minus the error). Non-Codex OpenAI clients receive the unchanged {object,data} response. (#3481 — thanks @diegosouzapw)
  • fix(provider): Cursor's Responses-API-shaped bodies on /chat/completions are detected and handled — a body with input but no messages is now classified as openai-responses (instead of forcing openai and building from undefined messages → upstream 400); standard OpenAI clients are unaffected by the messages===undefined guard. (#3490 — thanks @borodulin)
  • fix(sse): numeric provider IDs normalized to strings across 4 more surfaces — extends #3427 to the Responses-API SSE passthrough (response_id/item_id/call_id), the buffered/flush path in stream.ts, the dedup-key builders, and sseParser.ts, preventing undefined lookups when IDs arrive as numbers. (#3451 — thanks @disafronov)
  • fix(theoldllm): X-Request-Token generated server-side, dropping the Playwright dependency — replicates the site's client rie() token (djb2 hash + oldllm-client-2026 seed + UA prefix + 8-hex crypto.randomUUID suffix) directly, so The Old LLM no longer needs a headless browser to mint tokens. (#3491 — thanks @borodulin / @diegosouzapw)
  • fix(combo): parallel pre-screen + circuit-breaker fast-exit for priority combos — provider profiles and model availability for all targets are pre-screened concurrently (max 5), and targets whose circuit breaker is OPEN are skipped immediately, reducing first-token latency on multi-target priority combos. (#3169 — thanks @pizzav-xyz)
  • fix(authz): URL-tokenized client endpoints (/api/v1/vscode/<key>/...) authenticate again when the caller sends its own non-OmniRoute Authorization header — a non-Bearer <token> header (e.g. VS Code Copilot's own, or an empty Bearer ) no longer short-circuits auth; it falls through to the path-scoped URL token (still validated downstream), instead of 401'ing under REQUIRE_API_KEY=true. (#3504 — thanks @zhiru / @diegosouzapw)
  • fix(playground): the dashboard provider Test playground works under REQUIRE_API_KEY=true — it previously sent the masked key (sk-xxxx****yyyy) as a bearer (always invalid → 401). It now authenticates via the dashboard session and sends only the key id (x-omniroute-playground-key-id); the gateway resolves the secret server-side, honored only for an authenticated session and never putting the key secret on the wire. (#3503 — thanks @zhiru / @diegosouzapw)
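
The base-URL normalization in #3519 can be approximated as below. The note only states that a base URL with no path or a non-/v1 path gets /v1/chat/completions appended; how a URL that already ends in /v1 or in /chat/completions is treated is an assumption here, as are the function names.

```typescript
// Trim trailing slashes with a loop rather than a regex to keep the cost linear.
function stripTrailingSlashes(value: string): string {
  let end = value.length;
  while (end > 0 && value[end - 1] === "/") end--;
  return value.slice(0, end);
}

function toChatCompletionsUrl(baseUrl: string): string {
  const trimmed = stripTrailingSlashes(baseUrl);
  // Assumed: an already complete endpoint is left alone.
  if (trimmed.endsWith("/chat/completions")) return trimmed;
  // Assumed: a base that already ends in /v1 only needs the resource path.
  if (trimmed.endsWith("/v1")) return trimmed + "/chat/completions";
  // From the release note: no path, or a non-/v1 path, gets the full suffix.
  return trimmed + "/v1/chat/completions";
}

console.log(toChatCompletionsUrl("https://llama.example.foo"));
console.log(toChatCompletionsUrl("https://bar.example.com/foo"));
```

With these rules the two example hosts from the note resolve to `https://llama.example.foo/v1/chat/completions` and `https://bar.example.com/foo/v1/chat/completions`.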

📝 Maintenance

  • feat(docs): doc-accuracy gate — new npm run check:fabricated-docs (scripts/check/check-fabricated-docs.mjs) indexes the codebase (api routes, env vars, CLI commands) and flags API-path/env-var/CLI/hook/file-ref claims in docs/** + AGENTS.md that don't exist in source (soft-fail by default, --strict for CI; wired into check:docs-all). Also refreshes the AGENTS.md live counts against source. (#3510 — thanks @oyi77)
  • chore: ignore local quality reports and prompt artifacts (quality-metrics.json, PLANO-/RELATORIO-QUALITY-GATES.md, stray prompt .txt files) so they no longer surface in git status. (thanks @diegosouzapw)

🔒 Security

  • fix(opencode-plugin): bounded the regex quantifiers in normaliseFreeLabel to close a polynomial-ReDoS (CodeQL js/polynomial-redos) — an unbounded \s* before an anchored \s*$ allowed O(n²) backtracking on attacker-influenced provider/model display names; bounded to {0,8}/{1,8}. (thanks @diegosouzapw)
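
The bounded-quantifier idea can be shown on a made-up pattern. The actual expression inside normaliseFreeLabel is not given in these notes, so the `(free)` suffix pattern and the `stripFreeSuffix` name below are purely hypothetical; only the `{0,8}` bound is taken from the fix description.

```typescript
// Unbounded form, shown for contrast only: /\s*\(free\)\s*$/i
// On a long run of whitespace, each start position retries the whole run,
// which gives quadratic backtracking.

// Bounded form: at most 8 whitespace characters are consumed on either side.
const FREE_SUFFIX = /\s{0,8}\(free\)\s{0,8}$/i;

function stripFreeSuffix(label: string): string {
  return label.replace(FREE_SUFFIX, "");
}

console.log(stripFreeSuffix("Model X (free)")); // "Model X"
```

The trade-off is that a label with more than 8 spaces before the marker keeps the excess whitespace, which is usually acceptable for display names.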

What's Changed

Full Changelog: v3.8.17...v3.8.18

v3.8.17

09 Jun 10:58
@diegosouzapw diegosouzapw

✨ New Features

  • feat(providers): LMArena provider — routes requests to the LMArena battle platform via the new lmarena executor; supports streaming chat completions. (#3421 — thanks @oyi77)
  • feat(providers): ZenMux provider — adds the zenmux executor for ZenMux's OpenAI-compatible endpoint with streaming support. (#3429 — thanks @oyi77)
  • feat(providers): Gemini Business provider — adds the gemini-business executor (Phase 2C of the Google provider expansion), enabling Gemini models via Google Workspace accounts. (#3436 — thanks @oyi77)
  • feat(plugin+api): auto-combos API + free model quota display — new GET /api/combos/auto endpoint lists dynamically scored combos; provider pages now surface free-tier quotas inline; MCP-plugin surface extended to match. (#3435 — thanks @mrmm)
  • feat(opencode-plugin): per-prefix API format selection, debug logging, and free-label normaliser — three backports from the mrmm fork: each route prefix can specify its own wire format (OpenAI / Anthropic / Gemini), structured debug output is toggled via env var, and free-tier labels are normalized across providers. (#3420 — thanks @herjarsa)
  • feat(connections): connection pagination, health filter, batch-delete confirmation, and custom banned keywords — the provider connections table is now paginated; a health-state filter lets operators show only healthy/degraded/failed connections; multi-select + confirm dialog for bulk deletes; per-connection keyword denylist for content safety. (#3454 — thanks @sdfsdfw2)
  • feat(settings): Endpoint Token Saver visibility toggle — operators can now show or hide the Token Saver widget on the endpoint page from Settings → Appearance. (#3461 — thanks @rdself)
  • feat(catalog): model catalog name feature flag — a new feature flag controls whether the catalog exposes provider-prefixed model names, letting deployments opt into the legacy bare-name format for downstream tooling compatibility. (#3464 — thanks @rdself)

🔧 Bug Fixes

  • fix(translator): Vertex AI tool calls no longer fail with 400 Unknown name "id" — the OpenAI-style id field is stripped from functionCall/functionResponse parts for vertex/vertex-partner; the public Gemini API still receives id as required for Gemini 3+ signature matching. (#3457 — thanks @nullbytef0x / @diegosouzapw)
  • fix(claude): Claude Code claude-opus-4-8 tool calls no longer break with tool call could not be parsed — OmniRoute no longer force-injects interleaved-thinking / advanced-tool-use / effort beta flags the client never negotiated; clients sending their own anthropic-beta header control those betas themselves. (#3458 — thanks @Forcerecon / @diegosouzapw)
  • fix(catalog): imported/custom models on no-auth providers (e.g. The Old LLM) now appear in GET /api/v1/models and the Playground model selector — the eligibility gate required a DB connection row which no-auth providers never have, silently dropping every imported model for them. (#3463 — thanks @tjengbudi / @diegosouzapw)
  • fix(browser): optional cloakbrowser import no longer causes bundle errors when the package is absent — the import is now wrapped in a dynamic require so the build succeeds on environments that don't install the optional dep. (#3460 — thanks @rdself)
  • fix(claude-web): claude-web session handling cleanup — corrects an edge case where session cookies were not properly refreshed after a Turnstile challenge, and removes stale wrapper code left over from the provider split. (#3449 — thanks @androw)
  • fix(analytics): SQL named params are now scoped per query context — a shared params object was being mutated across concurrent analytics queries, causing SQLITE_MISUSE: named parameter not found errors under load. (#3447 — thanks @ReqX)
  • fix(command-code): chat endpoint reverted to /alpha/generate and model-sync discovery fixed — a prior refactor incorrectly targeted the wrong path, causing Command Code completions to silently 404; model listing now also resolves from the correct discovery endpoint. (#3432 — thanks @TapZe)
  • fix(command-code): CLI version header aligned to current Command Code release — the X-Command-Code-Version header value was pinned to a stale version string, causing upstream version-gated features to be rejected. (#3462 — thanks @hevener10)
  • fix(sse): provider IDs are normalized to strings before lookup — numeric provider IDs (e.g. from legacy DB rows) caused undefined lookups in the executor registry; all IDs are now coerced to string at the SSE entry point. (#3427 — thanks @disafronov)
  • fix(stream): textual tool-call slicing index mismatch resolved and containsTextualToolCallMarker deduplicated — two related bugs in the rolling-buffer parser caused partial tool-call chunks to be emitted twice or sliced from the wrong offset, producing garbled JSON in streamed tool responses. (#3413 — thanks @Ardem2025)
  • fix(stream): OpenAI usage-only chunks (empty choices: []) are now passed through instead of being dropped — some providers emit a trailing stats-only chunk after the last content delta; discarding it caused usage counters to be missing in logged responses. (#3422 — thanks @xz-dev)
  • fix(translator): empty-string reasoning_content replaced with placeholder on cache miss — injectEmptyReasoningContentForToolCalls pre-sets reasoning_content="" before the cache lookup; the old guard checked for undefined, never firing on miss and leaving "" in place, which DeepSeek V4+ rejects with a 400. (#3433 — thanks @ViFigueiredo)
  • fix(catalog): combos auto-compute context_length for any provider-ID form — the context-length resolution only matched exact-string provider IDs, missing combos declared with a numeric or aliased ID; the lookup now normalizes before matching. (#3417 — thanks @herjarsa)
  • fix(healthcheck): container bridge network IP probed correctly — the healthcheck script was hard-coded to localhost which resolves to IPv6 ::1 inside some container runtimes; it now queries the bridge gateway IP so the probe succeeds on both bridge and host networking modes. (#3434 — thanks @naimo84)
  • fix(publish): onnxruntime CUDA binary removed from npm tarball — the native .node binary pushed the tarball past the registry's upload size limit (HTTP 413 Payload Too Large) and was never needed at runtime (OmniRoute uses the CPU build); the pack policy now excludes the CUDA artifact. (#3437 — thanks @herjarsa)
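
The translator fix above comes down to the difference between an `undefined` check and an emptiness check. The following is a minimal sketch of that guard, not the project's code: the message shape, the cache, the function name `fillReasoningContent`, and the placeholder string are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration only; not OmniRoute's actual types.
interface AssistantMessage {
  role: "assistant";
  content: string | null;
  reasoning_content?: string;
  tool_calls?: { id: string }[];
}

// Assumed placeholder value; the real one is not stated in the release notes.
const REASONING_PLACEHOLDER = "(reasoning unavailable)";

function fillReasoningContent(
  msg: AssistantMessage,
  cache: Map<string, string>,
): AssistantMessage {
  const firstCallId = msg.tool_calls?.[0]?.id;
  if (firstCallId === undefined) return msg; // only tool-call messages need the field

  const cached = cache.get(firstCallId);
  if (cached !== undefined && cached !== "") {
    return { ...msg, reasoning_content: cached }; // cache hit: reuse stored reasoning
  }

  // A guard of the form `msg.reasoning_content === undefined` never fires here,
  // because the field was pre-set to "" before the lookup. Treat "" as missing too.
  if (msg.reasoning_content === undefined || msg.reasoning_content === "") {
    return { ...msg, reasoning_content: REASONING_PLACEHOLDER };
  }
  return msg;
}
```

With this guard, a cache miss on a message whose `reasoning_content` was pre-set to `""` ends up with the placeholder rather than the empty string that the upstream rejects.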

📝 Maintenance

  • docs: critical documentation gaps closed — new guides for ACP protocol, router strategies, compression, REST API reference, and updated AUTO-COMBO deep-dive; getting-started section added with Quick Start, Providers, Free Tiers, Auto-Combo, and Troubleshooting pages. (#3438 — thanks @oyi77)
  • docs(opencode-plugin): plugin README rewritten to lead with the why — positions the plugin as the recommended integration path over the legacy @omniroute/opencode-provider package, with migration guidance. (#3418 — thanks @herjarsa)
  • docs(env): COMMAND_CODE_VERSION override documented — environment variable added to .env.example and reference docs so operators can pin the CLI version header without a code change. (#3462 — thanks @hevener10)
  • test(auto-combo): same-provider connection identity assertion added — regression test covering the case where two connections for the same provider share an account ID, verifying the combo engine selects the correct one. (#3378 — thanks @oyi77)
  • deps: electron upgraded to 42.3.3; electron-builder to 26.15.2; electron-updater to 6.8.9; 4 development-group and 10 production-group packages bumped via Dependabot. (#3441 / #3442 / #3443 / #3444 / #3445 — thanks @diegosouzapw)
  • chore(release): v3.8.17 development cycle opened from main. (thanks @diegosouzapw)

What's Changed

  • deps: bump electron from 42.3.2 to 42.3.3 in /electron by @dependabot[bot] in #3441
  • deps: bump the production group with 10 updates by @dependabot[bot] in #3444
  • deps: bump the development group with 4 updates by @dependabot[bot] in #3445
  • deps: bump ele...

v3.8.16

08 Jun 19:40
@diegosouzapw diegosouzapw

[3.8.16] — 2026-06-08

✨ New Features

  • feat(vision-bridge): auto-routing to the fastest available vision model — when a request carries image content and the selected model does not support vision, OmniRoute now transparently delegates to the best-match vision-capable model instead of returning an error. (#3377 — thanks @herjarsa)
  • feat(web-session): web-session pool observability — new MCP tool get_web_session_pool_health and a health-matrix REST response (GET /api/web-session-pool/health) expose per-provider slot counts, lease ages, and error budgets so operators can diagnose pool exhaustion without digging through logs. (#3395 — thanks @oyi77)
  • feat(web-session): adaptive keepalive threshold — the keepalive heartbeat interval now self-adjusts based on observed provider idle-disconnect behaviour instead of using a fixed constant, reducing both unnecessary pings and unexpected session drops. (#3397 — thanks @oyi77)
  • feat(web-session): bulk credential import endpoint (POST /api/web-session/import) — import a JSON array of session credentials in one call; each entry is validated and inserted atomically, with per-entry success/failure reported in the response. (#3403 — thanks @oyi77)
  • feat(api): REST API for session pool health (GET /api/session-pool/health) — a dashboard-facing endpoint that aggregates live slot usage, wait-queue depth, and error rates across all active session pools; wired to a new dashboard widget. (#3404 — thanks @oyi77)

🔧 Bug Fixes

  • fix(sse): eliminate race window in usageTokenBuffer settings update — a concurrent save + stream-start could race to apply stale settings, causing token counts to roll back by up to 2 000 tokens after a restart; the update now uses an atomic read-modify-write on the shared settings ref. (#3405 — thanks @diegosouzapw)
  • fix(context-cache): server-side context-cache pinning now correctly persists across restarts; proxy message content no longer leaks into the upstream prompt; and the context_cache_protection toggle is properly saved to the DB on change. (#3399 — thanks @k0valik)
  • fix(providers): the provider settings page now refreshes its model list after a successful sync-models call — previously the stale list remained until a full page reload. (#3402 — thanks @0xtbug)
  • fix(stream): empty-choices chunks (choices array present but empty, no finish_reason) are now silently dropped rather than emitted as a retry: SSE event — removes spurious retry lines from streaming responses for providers that emit heartbeat keep-alive chunks. (#3400 — thanks @0xtbug)
  • fix(account-fallback): the connection cooldown deduplication state is now preserved across the fallback retry chain — previously a second concurrent failure on the same account could clear the dedupe flag set by the first, allowing the cooldown window to be extended twice. (#3381 — thanks @oyi77)
  • fix(stream): false-positive textual tool-call marker truncation — containsTextualToolCallMarker now tracks how much of the accumulated streamed content has already been emitted, so it only withholds the unemitted tail rather than re-scanning from the start on every new chunk. (#3382 — thanks @Ardem2025)
  • fix(sanitizer): containsTextualToolCallContent() now requires the complete [Tool call: name]\nArguments: header pattern instead of a bare .includes("[Tool call:") check — prevents the non-streaming response sanitizer from nulling out model responses that merely quote [Tool call:] in prose or code examples. (#3355 — thanks @diegosouzapw)
  • fix(stream): the streaming textual tool-call guard now flushes any remaining buffered content as plain text when the stream ends, regardless of whether the buffer contains "Arguments:" — previously, a partial/incomplete tool-call header that arrived at end-of-stream was silently dropped. (#3355 — thanks @diegosouzapw)
  • fix(executor): Mistral (and any provider in PROVIDERS_REQUIRING_USER_LAST_MESSAGE) no longer receives a trailing assistant message with plain text content — stripTrailingAssistantForProvider drops it on the upstream-send path, fixing the 400: Expected last role User or Tool ... but got assistant rejection. (#3396 — thanks @diegosouzapw)
  • fix(mitm): getMitmStatus() in the build-time stub (Docker image) now returns a graceful { running: false } status instead of throwing, so the Agent Bridge UI shows a clean "stopped" state rather than an error banner in containerised deployments. (#3390 — thanks @diegosouzapw)
  • fix(env): corrected casing of OMNIROUTE_TRACE in .env.example and all related documentation files — was previously mixed-case in some places, causing the variable to be silently ignored on case-sensitive file systems. (#3393 — thanks @androw)
  • fix(featureFlags): PRICING_SYNC_ENABLED description now clearly states that the feature requires the corresponding environment variable to be set — removes the ambiguity that led operators to enable it via the UI only and wonder why sync never ran. (#3394 — thanks @androw)
  • fix(docker): runner-web image now copies playwright and playwright-core from the builder stage instead of using npx to fetch them at build time — eliminates the exit 127 failure on GitHub-hosted runners where the registry download is unreliable.
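
The sanitizer fix above replaces a bare substring check with a match on the full header. A minimal sketch of the difference follows; the function names and the exact regular expression are assumptions, since the notes only describe the pattern as `[Tool call: name]\nArguments:`.

```typescript
// Old behaviour as described in the notes: any mention of the marker matches.
function naiveContainsToolCall(text: string): boolean {
  return text.includes("[Tool call:");
}

// Stricter check: require the complete "[Tool call: name]\nArguments:" header.
// This regex is an assumption; the pattern OmniRoute actually uses is not shown.
const TOOL_CALL_HEADER = /\[Tool call:\s*[^\]\r\n]+\]\r?\nArguments:/;

function strictContainsToolCall(text: string): boolean {
  return TOOL_CALL_HEADER.test(text);
}

// Prose that merely quotes the marker versus a real textual tool call.
const prose = "The model may print [Tool call:] markers in its output.";
const realCall = '[Tool call: get_weather]\nArguments: {"city":"Lisbon"}';

console.log(naiveContainsToolCall(prose), strictContainsToolCall(prose));
console.log(naiveContainsToolCall(realCall), strictContainsToolCall(realCall));
```

The naive check flags both strings, while the strict check flags only the real call, so a response that quotes the marker in prose or in a code example is no longer nulled out.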

📝 Maintenance

  • ci(docker): the CI pipeline now builds and publishes the -web image variant in the same Docker publish workflow, so both the standard and browser-backed images stay in sync on every release. (#3389 — thanks @zhiru)
  • ci(e2e): E2E shard suite hardened — timeout raised to 45 min for the heaviest shard; build artifact now uses an explicit tar bundle to avoid upload-artifact@v4 LCA path ambiguity; node_modules copied into standalone after download; browser cache added to cut cold-shard time; sync-models endpoint mocked in providers-management.spec.ts so the import modal reaches "done" immediately. (#3387 / #3392 — thanks @diegosouzapw)
  • docs: Codex CLI configuration guide added to the dashboard (/dashboard/codex-config) — covers profile naming, model selection, and the CODEX_* environment variables accepted by OmniRoute. (thanks @diegosouzapw)
  • chore(agentSkills): catalog expanded to 43 entries — config-codex-cli added as a new CONFIG_SKILL_IDS category; all skill-count assertions updated across unit and integration test suites; next-fetch opts cast to satisfy the TypeScript overload signature in the skill runner. (thanks @diegosouzapw)

🙌 Contributors

Thanks to everyone whose work landed in v3.8.16:

| Contributor | Contribution |
| --- | --- |
| @herjarsa | Vision-bridge auto-routing to fastest vision model (#3377) |
| @oyi77 | Web-session pool observability (#3395), adaptive keepalive (#3397), bulk credential import (#3403), session pool REST API (#3404), cooldown dedupe fix (#3381) |
| @Ardem2025 | Stream false-positive tool-call marker truncation fix (#3382) |
| @zhiru | Docker -web image variant CI (#3389) |
| @androw | OMNIROUTE_TRACE casing fix (#3393), PRICING_SYNC_ENABLED clarification (#3394) |
| @k0valik | Context-cache pinning + proxy message leak fix (#3399) |
| @0xtbug | Empty-choices chunk drop (#3400), model list refresh after sync (#3402) |
| @diegosouzapw | Release engineering + usageTokenBuffer race fix (#3405), sanitizer+stream hardening (#3355/#3410), Mistral trailing-assistant fix (#3396/#3409), mitm Docker stub (#3390/#3408), E2E shard stabilization (#3387/#3392), Docker -web build fix, and direct release-branch commits |

What's Changed

  • fix(ci): stop E2E shard 5/6 being cancelled mid-run (timeout headroom) by @diegosouzapw in #3387
  • fix(ci): E2E shard headroom (50m) + live line reporter for diagnosis by @diegosouzapw in #3392
  • fix(env): correct casing of OMNIROUTE_TRACE in .env.example and related files by @androw in #3393
  • fix(featureFlags): update description for PRICING_SYNC_ENABLED to clarify environment variable requirement by @androw in #3394
  • fix(account-fallback): preserve provider cooldown dedupe state by @oyi77 in #3381
  • ci(docker): also build & publish the -web image variant by @zhiru in #3389
  • fix(stream): solve false-positive textual tool-call marker truncation by @Ardem2025 in #3382
  • fix(stream): drop empty ch...

v3.8.15

07 Jun 17:10
@diegosouzapw diegosouzapw
929caeb

✨ New Features

  • feat(error-rules): provider-specific error classification with scope — a declarative rules layer lets providers map upstream error shapes to the right resilience action (provider circuit-breaker vs connection cooldown vs model lockout) at the correct scope, instead of relying on generic status-code heuristics. (#3370 — thanks @herjarsa)
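
One way to picture a declarative rules layer of this kind is a table of per-provider matchers, each naming an action and a scope. The sketch below is illustrative only: the rule shape, the provider name `example-provider`, the match conditions, and the `classify` function are all hypothetical and are not taken from the OmniRoute source.

```typescript
// The three resilience actions named in the notes, with an assumed scope label for each.
type Action = "provider_circuit_breaker" | "connection_cooldown" | "model_lockout";
type Scope = "provider" | "connection" | "model";

interface UpstreamError {
  status: number;
  body: string;
}

// Hypothetical rule shape: a predicate over the upstream error plus the outcome.
interface ErrorRule {
  provider: string;
  match: (err: UpstreamError) => boolean;
  action: Action;
  scope: Scope;
}

// Example rules for a made-up provider; real rules would mirror real error shapes.
const RULES: ErrorRule[] = [
  {
    provider: "example-provider",
    match: (e) => e.status === 429 && e.body.includes("quota"),
    action: "connection_cooldown",
    scope: "connection",
  },
  {
    provider: "example-provider",
    match: (e) => e.status === 404 && e.body.includes("model"),
    action: "model_lockout",
    scope: "model",
  },
  {
    provider: "example-provider",
    match: (e) => e.status >= 500,
    action: "provider_circuit_breaker",
    scope: "provider",
  },
];

// First matching rule wins; null means "fall back to generic status-code heuristics".
function classify(
  provider: string,
  err: UpstreamError,
): { action: Action; scope: Scope } | null {
  const rule = RULES.find((r) => r.provider === provider && r.match(err));
  return rule ? { action: rule.action, scope: rule.scope } : null;
}
```

Keeping the rules as data means a new upstream error shape can be handled by adding a rule for that provider, without touching the generic fallback path.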

🔧 Bug Fixes

  • fix(combo): add 429 to PROVIDER_FAILURE_ERROR_CODES so a rate-limited target no longer drives an infinite retry loop — the combo now cools the target down and moves on. (#3366 — thanks @herjarsa)
  • fix(catalog): add a getTokenLimit fallback for combo targets with an unknown context window, so a target whose context can't be resolved no longer breaks token-limit computation for the combo. (#3369 — thanks @herjarsa)
  • fix(auto-combo): include no-auth providers in Auto-Combo declaratively (driven by provider metadata rather than a hard-coded list), so keyless providers are eligible candidates. (#3365 — thanks @oyi77)
  • fix(auto-combo): validate web-session credentials before selecting a web-cookie provider as an Auto-Combo target, so an expired/empty session doesn't get picked. (#3371 — thanks @oyi77)
  • fix(command-code): update the Command Code base URL from /alpha/ to /provider/v1/ (upstream moved the endpoint). (#3372 — thanks @TapZe)
  • fix(kiro): probe %APPDATA%\kiro\storage.db on Windows during Kiro auto-import, so the import finds the credential store where Kiro actually writes it on Windows. (#3375, fixes #3363 — thanks @diegosouzapw; reported by @Gerashka2)
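
The combo fix above can be sketched as a loop that tries each target at most once and cools down any target that fails with a code from the failure set. This is a simplified illustration under stated assumptions: only the presence of 429 in `PROVIDER_FAILURE_ERROR_CODES` comes from the notes, while the other codes, the `Target` shape, the `runCombo` function, and the cooldown duration are hypothetical.

```typescript
// Assumed contents; the notes only confirm that 429 was added to this set.
const PROVIDER_FAILURE_ERROR_CODES = new Set<number>([429, 500, 502, 503]);

// Hypothetical target shape: an ID plus the timestamp until which it is cooling down.
interface Target {
  id: string;
  cooldownUntil: number;
}

// Stand-in for an upstream call; returns the HTTP status it produced.
type Send = (target: Target) => number;

function runCombo(
  targets: Target[],
  send: Send,
  now: number,
  cooldownMs: number = 60_000,
): string | null {
  for (const target of targets) {
    if (target.cooldownUntil > now) continue; // still cooling down, skip it

    const status = send(target);
    if (status >= 200 && status < 300) return target.id;

    if (PROVIDER_FAILURE_ERROR_CODES.has(status)) {
      // Rate-limited or failing target: cool it down and move on to the next one.
      target.cooldownUntil = now + cooldownMs;
      continue;
    }
    return null; // any other error ends the attempt in this sketch
  }
  return null; // every target was skipped or failed
}
```

Because a 429 now marks the target as cooling down, a later pass over the same list skips it instead of retrying it immediately.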

📝 Maintenance

  • fix(migrations): restore 095_provider_node_custom_headers.sql — the file was deleted from the release branch twice, each time because a contributor branch's git rm of a duplicate was folded into the squash merge; it is now restored and guarded. (thanks @diegosouzapw)

🙌 Contributors

Thanks to everyone whose work landed in v3.8.15:

| Contributor | PRs / Issues |
| --- | --- |
| @herjarsa | #3366, #3369, #3370 |
| @oyi77 | #3365, #3371 |
| @TapZe | #3372 |
| @Gerashka2 | reported #3363 |
| @diegosouzapw | maintainer — #3375 shepherding, migration restores |

What's Changed

Full Changelog: v3.8.14...v3.8.15

Contributors

diegosouzapw, oyi77, and 3 other contributors