Pipeline Design 313

ezigus edited this page Apr 18, 2026 · 2 revisions

Design: feat(ruflo): 08a — persist ruflo memory across CI runs via orphan git branch

Context

Ruflo's learned memory (patterns, failure history, optimizations) is lost between CI runs because the MCP daemon is ephemeral — it starts in ruflo_detect() and stops in ruflo_teardown() within a single pipeline execution. The existing ruflo_export_memory() (scripts/lib/ruflo-adapter.sh:518) writes to .claude-flow/data/memory-export.json and ruflo_import_memory() (:500) reads it back, but neither persists beyond the runner's filesystem lifetime.

The repo already has a proven orphan-branch persistence pattern: the shipwright-data branch (.github/workflows/shipwright-pipeline.yml:803-852) uses tmp-dir → orphan checkout → commit → push with a 5-retry loop and jitter to persist events.jsonl, costs.json, and budget.json. This pattern handles concurrent push conflicts and requires no external infrastructure beyond GITHUB_TOKEN.

Constraints:

  • Bash 3.2 compatible (no associative arrays, no readarray)
  • All new paths must fail-open (return 0) — pipeline must never break over memory
  • jq is available in CI (installed at workflow line 389)
  • ruflo memory export may only capture the KV store, not HNSW indices or Q-weights — this is a known limitation we accept
  • Concurrent CI jobs can race on git push to the same ref

Decision

Approach: Dedicated orphan branch refs/ruflo-memory

Add 4 functions to scripts/lib/ruflo-adapter.sh (after line 529) and 2 workflow steps to .github/workflows/shipwright-pipeline.yml. Use a separate orphan branch (refs/ruflo-memory) rather than reusing shipwright-data, following separation of concerns — ruflo memory has a different lifecycle, ownership, and growth profile than operational metrics.

Component Diagram

┌─────────────────────────────────────────────────────────────────┐
│ GitHub Actions Workflow (shipwright-pipeline.yml)               │
│                                                                 │
│  ┌──────────────────┐              ┌──────────────────────┐     │
│  │ Restore step     │──────call───>│ ruflo_ci_memory_pull │     │
│  │ (before pipeline)│              │ (ruflo-adapter.sh)   │     │
│  └──────────────────┘              └────────┬─────────────┘     │
│                                             │                   │
│                                    git fetch + git show         │
│                                             │                   │
│                                    ┌────────▼─────────────┐     │
│                                    │ refs/ruflo-memory    │     │
│                                    │ (orphan branch)      │     │
│                                    │ └─ memory-export.json│     │
│                                    └────────▲─────────────┘     │
│                                             │                   │
│                                   git commit + push (3x)        │
│                                             │                   │
│  ┌──────────────────┐              ┌────────┴─────────────┐     │
│  │ Persist step     │──────call───>│ ruflo_ci_memory_push │     │
│  │ (after, always)  │              │ (ruflo-adapter.sh)   │     │
│  └──────────────────┘              └──────────────────────┘     │
│                                        │           │            │
│                               ┌────────▼──┐  ┌─────▼──────────┐ │
│                               │ prune     │  │ merge          │ │
│                               │ (90-day)  │  │ (union/ts-win) │ │
│                               └───────────┘  └────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

5 components:

#  Component                     Responsibility                                  Location
1  ruflo_prune_memory_export()   Remove entries older than N days                ruflo-adapter.sh:~530
2  ruflo_merge_memory_exports()  Union-merge two exports, newer timestamp wins   ruflo-adapter.sh:~555
3  ruflo_ci_memory_pull()        Fetch from orphan branch → import into ruflo    ruflo-adapter.sh:~580
4  ruflo_ci_memory_push()        Export → prune → merge → push to orphan branch  ruflo-adapter.sh:~610
5  Workflow steps                Wire pull/push into CI lifecycle                shipwright-pipeline.yml

Interface Contracts

# Prune entries older than max_age_days from a memory export JSON file.
# Rewrites file in-place via atomic tmp+mv.
# Returns: 0 on success or empty input, 1 on jq failure.
# Precondition: file_path is a valid JSON file (or missing → no-op).
# Postcondition: file contains only entries with timestamp within max_age_days.
ruflo_prune_memory_export(file_path: string, max_age_days: int) → 0|1

# Merge two memory export JSON files. Union of keys; on conflict, newer
# timestamp wins. Output written atomically to output_file.
# Returns: 0 on success, 1 on jq failure or missing inputs.
# Precondition: local_file exists; remote_file may not exist (→ copy local).
# Postcondition: output_file contains merged superset.
ruflo_merge_memory_exports(local_file: string, remote_file: string, output_file: string) → 0|1

# Fetch memory from refs/ruflo-memory and import into running ruflo instance.
# Guards: returns 0 immediately if ruflo unavailable or CI!=true.
# All errors → return 0, log warning. Never fails the pipeline.
ruflo_ci_memory_pull() → 0

# Export ruflo memory, prune, merge with remote, push to refs/ruflo-memory.
# Guards: same as pull. 3-retry loop with jitter on push conflict.
# All errors → return 0, log warning. Never fails the pipeline.
ruflo_ci_memory_push() → 0

Data Flow

Pull (CI start):

ruflo_ci_memory_pull()
 ├─ guard: ruflo_available? CI=true? → else return 0
 ├─ git fetch origin refs/ruflo-memory:refs/ruflo-memory
 ├─ git show refs/ruflo-memory:memory-export.json > $tmpfile
 │ └─ (fails on first-ever run → return 0, no prior memory)
 ├─ ruflo_with_timeout 30 _ruflo_run_quiet memory import --input $tmpfile
 ├─ emit_event "ruflo.ci_memory_pull_ok" "bytes=$(wc -c < $tmpfile)"
 └─ return 0
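
The git half of the pull flow above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: the helper name fetch_memory_export_sketch and its two arguments are hypothetical, and the guards, the ruflo import call, and emit_event are left out so the block stays self-contained.

```bash
#!/usr/bin/env bash
# Sketch only: fetch_memory_export_sketch is a hypothetical helper name.
# It covers the fetch + show steps of the pull flow and always returns 0.

fetch_memory_export_sketch() {
  local remote="$1" out_file="$2" ref="refs/ruflo-memory"

  # Fetch the custom ref into the local repository. On the first-ever run the
  # ref does not exist on the remote, so a failure just means "no prior memory".
  if ! git fetch --quiet "$remote" "${ref}:${ref}" 2>/dev/null; then
    echo "WARN: ${ref} not available from ${remote}; no prior memory" >&2
    return 0
  fi

  # Read the file straight out of the ref; nothing is checked out.
  if ! git show "${ref}:memory-export.json" > "$out_file" 2>/dev/null; then
    rm -f "$out_file"
    echo "WARN: memory-export.json not found in ${ref}" >&2
    return 0
  fi
  return 0
}
```

A caller would check whether the output file exists before handing it to the import step, since a missing file is the normal first-run state rather than an error.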

Push (CI end):

ruflo_ci_memory_push()
 ├─ guard: ruflo_available? CI=true? → else return 0
 ├─ ruflo_export_memory (existing function, writes memory-export.json)
 ├─ ruflo_prune_memory_export($export_file, 90) || true
 ├─ retry loop (attempts=1..3):
 │   ├─ git fetch origin refs/ruflo-memory:refs/ruflo-memory
 │   ├─ git show refs/ruflo-memory:memory-export.json > $remote_file
 │   ├─ ruflo_merge_memory_exports($export_file, $remote_file, $merged)
 │   │   └─ on failure: use $export_file as-is (local-only)
 │   ├─ SW_TMP=$(mktemp -d) && cd $SW_TMP
 │   ├─ git init + remote add + orphan checkout or fetch existing
 │   ├─ cp $merged memory-export.json + git add + git commit
 │   ├─ git push origin HEAD:refs/ruflo-memory
 │   │   ├─ on success: break
 │   │   └─ on failure: sleep $((RANDOM % 5 + 2)), continue
 │   └─ cleanup $SW_TMP
 ├─ emit_event "ruflo.ci_memory_push_{ok|failed}"
 └─ return 0
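
The retry loop in the push flow can be sketched on its own, with one whole fetch → merge → commit → push attempt passed in as a command. The names retry_with_jitter_sketch and RETRY_SLEEP_CMD are hypothetical and exist only for this illustration; the sleep expression is the one from the flow above. Unlike the flow, this sketch skips the sleep after the final attempt.

```bash
#!/usr/bin/env bash
# Sketch only: retry_with_jitter_sketch and RETRY_SLEEP_CMD are hypothetical
# names introduced for illustration. Bash 3.2 compatible.

retry_with_jitter_sketch() {
  local max_attempts="$1" attempt=1
  shift

  while [ "$attempt" -le "$max_attempts" ]; do
    # "$@" stands in for one complete fetch/merge/commit/push attempt.
    if "$@"; then
      return 0
    fi
    echo "WARN: attempt ${attempt}/${max_attempts} failed" >&2
    if [ "$attempt" -lt "$max_attempts" ]; then
      # 2 to 6 seconds of jitter so concurrent jobs do not retry in lockstep.
      ${RETRY_SLEEP_CMD:-sleep} $((RANDOM % 5 + 2))
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

Because each attempt re-fetches and re-merges before pushing, a job that loses a race picks up the winner's entries on its next try. The caller would still discard the final status (for example with `|| true`) to stay fail-open.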

Error Boundaries

Component                   Error                               Handling                             Propagation
ruflo_prune_memory_export   jq parse failure                    Returns 1, file unchanged            Caller uses || true
ruflo_merge_memory_exports  jq failure or missing remote        Returns 1                            Caller falls back to local-only export
ruflo_ci_memory_pull        No orphan branch (first run)        git show fails → log, return 0       None; pipeline continues
ruflo_ci_memory_pull        ruflo memory import fails           Log warning, return 0                None
ruflo_ci_memory_push        Push conflict (concurrent CI)       Retry 3x with jitter                 After 3 failures: log, return 0
ruflo_ci_memory_push        Export produces empty/invalid JSON  Prune/merge get empty input → no-op  Push empty or skip
Workflow steps              Any failure                         continue-on-error: true              Pipeline unaffected
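
The fail-open rule shared by the rows above (log a warning, return 0) could be captured in one small wrapper. This is a sketch under assumptions: fail_open_sketch is a hypothetical name, and the real functions may handle each error inline instead.

```bash
#!/usr/bin/env bash
# Sketch only: fail_open_sketch is a hypothetical helper, not part of
# ruflo-adapter.sh. It runs a command, logs any failure, and returns 0.

fail_open_sketch() {
  local label="$1" rc=0
  shift

  # Capture the exit status without letting it propagate to the caller.
  "$@" || rc=$?

  if [ "$rc" -ne 0 ]; then
    echo "WARN: ${label} failed (exit ${rc}); continuing" >&2
  fi
  return 0
}
```

For example, `fail_open_sketch "memory import" some_command` would print a warning if some_command fails but would not change the pipeline's exit status.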

Merge Strategy: Union with Timestamp Tiebreaker

The merge uses jq to combine two JSON exports:

  • Keys present in only one file → included
  • Keys present in both → entry with the newer .timestamp (or .updated_at) wins
  • Worst case: 2 of 3 concurrent jobs lose their new learning if all 3 push-conflict and only 1 wins. This is acceptable per the issue spec — KV memory accumulates over many runs.
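
The merge rules above can be sketched with jq. The export schema is an assumption here: the sketch treats each file as one JSON object of the form {"<key>": {"timestamp": <epoch seconds>, ...}}, which may not match the real ruflo export. The function name merge_memory_exports_sketch is hypothetical. On equal timestamps the local entry wins, and a missing remote file results in a copy of the local file, following the precondition in Interface Contracts.

```bash
#!/usr/bin/env bash
# Sketch only. Assumes each export is a JSON object keyed by entry name, with
# a numeric .timestamp (or .updated_at) per entry. Real schema may differ.

merge_memory_exports_sketch() {
  local local_file="$1" remote_file="$2" output_file="$3" tmp

  [ -f "$local_file" ] || return 1
  tmp=$(mktemp "${output_file}.XXXXXX") || return 1

  # No remote yet (first run): the merged result is just the local export.
  if [ ! -s "$remote_file" ]; then
    cp "$local_file" "$tmp" && mv "$tmp" "$output_file"
    return $?
  fi

  # Start from the remote object and fold in each local entry. A local entry
  # replaces the remote one when the key is new or its timestamp is not older.
  if jq -s '
      def ts: (.timestamp // .updated_at // 0);
      .[0] as $remote | .[1] as $local
      | reduce ($local | to_entries[]) as $e ($remote;
          if (has($e.key) | not) or (($e.value | ts) >= (.[$e.key] | ts))
          then .[$e.key] = $e.value
          else .
          end)
    ' "$remote_file" "$local_file" > "$tmp"; then
    mv "$tmp" "$output_file"   # atomic replace
  else
    rm -f "$tmp"
    return 1
  fi
}
```

If jq rejects either input, the function returns 1 and leaves output_file untouched, so the caller can fall back to pushing the local export alone.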

Pruning Strategy

Before every push, entries older than 90 days are removed. This bounds the file size to roughly the volume of learning generated in 90 days of CI runs (expected <1MB). The cutoff is calculated as $(date -u -d "90 days ago" +%s) (Linux) with a date -v-90d fallback for macOS — though CI runners are Linux.
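
A minimal sketch of the prune step, under the same schema assumption as the merge: one JSON object keyed by entry name, with a numeric .timestamp (or .updated_at) in epoch seconds. The function name prune_memory_export_sketch is hypothetical, and keeping entries that carry no timestamp is a choice made for this sketch, not something the design specifies.

```bash
#!/usr/bin/env bash
# Sketch only. Assumes the export is a JSON object of the form
#   { "<key>": { "timestamp": <epoch seconds>, ... }, ... }
# The real ruflo export schema may differ.

prune_memory_export_sketch() {
  local file_path="$1" max_age_days="${2:-90}" cutoff tmp

  # Missing or empty file: nothing to prune.
  [ -s "$file_path" ] || return 0

  # GNU date first (Linux CI runners), BSD date as the macOS fallback.
  cutoff=$(date -u -d "${max_age_days} days ago" +%s 2>/dev/null) \
    || cutoff=$(date -u -v-"${max_age_days}"d +%s 2>/dev/null) \
    || return 1

  tmp=$(mktemp "${file_path}.XXXXXX") || return 1

  # Keep entries at or after the cutoff. Entries with no timestamp are kept
  # (an assumption of this sketch, to avoid silently dropping data).
  if jq --argjson cutoff "$cutoff" '
        with_entries(select(
          (.value.timestamp // .value.updated_at) as $ts
          | $ts == null or $ts >= $cutoff
        ))' "$file_path" > "$tmp"; then
    mv "$tmp" "$file_path"   # atomic replace
  else
    rm -f "$tmp"             # leave the original file unchanged
    return 1
  fi
}
```

The temporary file is created next to the target so that the final mv stays on one filesystem, which is what makes the replace atomic.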

Alternatives Considered

  1. Reuse shipwright-data branch — Pros: no new orphan branch, infrastructure already exists. Cons: mixes operational metrics (events, costs, budget) with ML memory; different retention needs; concurrent push conflicts already happen there and adding more data worsens the race. Rejected: separation of concerns outweighs branch proliferation.

  2. GitHub Actions artifacts — Pros: no git conflicts, built-in retention. Cons: artifacts are per-workflow-run; cross-run retrieval requires REST API + run ID lookup; artifacts expire (default 90 days, configurable but not persistent). Rejected: not suitable for cross-run learning that should persist indefinitely.

  3. External storage (S3, GCS) — Pros: no git conflicts, unlimited scale. Cons: requires additional credentials, infrastructure provisioning, and IAM setup — violates the "zero external infrastructure" constraint. Rejected: over-engineered for a JSON file under 1MB.

Implementation Plan

  • Files to create: None
  • Files to modify:
    • scripts/lib/ruflo-adapter.sh — add 4 functions after line 529
    • .github/workflows/shipwright-pipeline.yml — add 2 steps (restore before pipeline ~line 384, persist after ~line 802)
    • scripts/sw-ruflo-adapter-test.sh — add 6 unit tests at end of file
  • Dependencies: None new. jq already installed (workflow line 389). git available. ruflo binary checked via ruflo_available.
  • Risk areas:
    • Push conflict under concurrent CI (mitigated: 3-retry + jitter, proven pattern from shipwright-data)
    • ruflo memory export format stability — if the JSON schema changes, prune/merge jq filters break (mitigated: defensive jq with // empty fallbacks)
    • date command portability — CI is Linux so GNU date -d works; not tested on macOS but CI functions are guarded by CI=true

API Design Applicability

Endpoint Specification / Error Codes / Rate Limiting / Versioning: Not applicable — this feature adds shell functions to a bash script and workflow steps to a YAML file. There are no HTTP endpoints, REST APIs, or client-facing interfaces. The "API" is the 4 bash function signatures documented in Interface Contracts above.

Validation Criteria

  • ruflo_ci_memory_pull returns 0 when CI is unset (guard works)
  • ruflo_ci_memory_pull returns 0 when ruflo_available returns false
  • ruflo_ci_memory_pull returns 0 when orphan branch doesn't exist (first run)
  • ruflo_ci_memory_push returns 0 under all error conditions (fail-open)
  • ruflo_prune_memory_export removes entries older than threshold, keeps newer ones
  • ruflo_prune_memory_export handles empty JSON gracefully
  • ruflo_merge_memory_exports produces union of disjoint keys
  • ruflo_merge_memory_exports keeps newer timestamp on conflicting keys
  • ruflo_merge_memory_exports falls back to local-only when remote is missing/invalid
  • Workflow restore step has continue-on-error: true
  • Workflow persist step has continue-on-error: true and if: always()
  • All existing tests in sw-ruflo-adapter-test.sh continue to pass
  • No Bash 3.2 incompatibilities (no associative arrays, no readarray)
  • Atomic file writes use tmp + mv pattern throughout
