Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

meccin/sleipnir

Folders and files

NameName
Last commit message
Last commit date

Latest commit

History

2 Commits

Repository files navigation

Sleipnir

Agent = Model + Harness. This is the harness.

Sleipnir is a minimal, opinionated framework for disciplined AI-assisted development with Claude Code. Four files, two agents, one non-negotiable principle: every change must pass a contract before it counts as done.


Why "Sleipnir"

In Norse mythology, Sleipnir is Odin's eight-legged horse — the only steed that can ride through all nine realms, from Midgard to Hel and back. Three things make the name fit.

First, literally: Sleipnir is the horse that wears the harness. The name describes what the framework does.

Second, structurally: eight legs instead of four. A normal horse trades stability for speed, or speed for stability; Sleipnir has enough ground contact to get both at once. That is the promise of this framework — AI assistance that makes you faster without letting you drift.

Third, narratively: Sleipnir carries Odin between worlds. The harness carries the developer between the worlds of intent (spec.md), truth (contract.md), state (progress.md), and verdict (evaluation.md). The horse is not Odin; the horse is what makes Odin effective.

The two ravens

Odin sent two ravens across the nine realms each day and listened to what they brought back. Huginn ("Thought") and Muninn ("Memory"). Sleipnir uses both.

Huginn is the Implementer subagent. It reads the spec and contract, writes code, and updates progress. It never marks a task done.

Muninn is the Evaluator subagent. It holds the contracts in memory, runs the sensors (tests, lint, typecheck), and produces a verdict with file-and-line evidence. It never writes code.

There is an old line from the Eddas where Odin says he fears losing Muninn more than Huginn. Thought without memory becomes hallucination. That is the thesis of this framework.


The four files

Sleipnir runs on four markdown files per feature, plus an AGENTS.md at the repo root. Each one has a single owner and a single purpose.

spec.md holds the intent: what is being built, what is in scope, and the acceptance criteria. It is written by a human and reviewed by Muninn but never modified by Huginn.

contract.md holds the definition of "correct": verifiable rules, API shapes, schemas, invariants. Every assertion in this file must be checkable either by a sensor (test, linter, type checker) or by Muninn with evidence. This file is the framework's identity — without it, Sleipnir is just a wrapper for Claude Code.

progress.md holds the state of work in flight: current task, recently completed tasks, files touched this session, blockers, and the next step. Huginn updates it after every implementation run. This is what lets a fresh session pick up where the last one left off without re-learning the whole project.

evaluation.md holds Muninn's latest verdict: pass or fail per rubric dimension, each score backed by file-and-line evidence or command output. Overwritten every evaluation; previous verdicts live in git history.


The loop

 You write
 spec + contract
 │
 ▼
 ┌──────────────┐
 │ Huginn │ ◄─────────────┐
 │ implements │ │
 │ updates │ │
 │ progress │ │
 └──────┬───────┘ │
 │ │
 ▼ │
 ┌──────────────┐ │
 │ Muninn │ │
 │ evaluates │ │
 │ writes │ │
 │ evaluation │ │
 └──────┬───────┘ │
 │ │
 ┌────┴────┐ │
 │ PASS? │ │
 └─┬─────┬─┘ │
 │ no │ │
 └─────┼─────────────────────┘
 │ yes
 ▼
 Task done.
 Commit.

Two modes: Quick and Full

Sleipnir supports two modes, picked during init. The core loop is identical in both — what changes is how much ceremony each artifact carries.

Quick mode is for solo projects and maintenance-squad work. A task is a sentence. A spec is five lines. A contract is three to five assertions. Turnaround is minutes. Designed for the developer who already knows what to do and wants the contract-evaluator discipline without the process overhead.

Full mode is for teams that run sprint ceremonies, write user stories, care about CMMI alignment, or ship to stakeholders who read specs. User stories go in classic format, acceptance criteria in EARS notation, contracts as OpenAPI or JSON Schema where possible, and an extra planner step decomposes the spec into numbered tasks.

The rule that does not move: in both modes, contract.md exists and Muninn runs against it. Even three lines of contract is enough to turn "Claude made a change" into "the change passed a verifiable bar." That is the framework's identity and it does not negotiate with the mode.


Quick start

⚠️ init.sh is the bootstrapping tool. Other deliverables (subagents, slash commands, templates) land in subsequent steps; init.sh handles their absence gracefully.

Clone Sleipnir somewhere once:

git clone https://github.com/meccin/sleipnir.git ~/.sleipnir

Then from any project root:

# Interactive — recommended for first-time use.
# Five questions plus project metadata, then scaffolds.
~/.sleipnir/init.sh
# Non-interactive — accept all defaults (mode=quick, automation=semi-auto,
# language=en, team=small) and let stack auto-detection do the rest.
~/.sleipnir/init.sh -y
# Print help with all flags and examples.
~/.sleipnir/init.sh --help

The script auto-detects your stack from package.json, composer.json, pom.xml, or build.gradle, and offers it as the default in question 2.


Repository structure

sleipnir/
├── README.md # You are here
├── LICENSE # MIT
├── CHANGELOG.md # Versioned record of framework changes
├── init.sh # Bootstrap script (next delivery)
├── template/ # Files copied into a project by init.sh
│ ├── AGENTS.md # Template with placeholders
│ ├── .claude/
│ │ ├── settings.json # Hooks + permissions
│ │ ├── commands/ # Slash commands (harness-plan, -implement, -evaluate)
│ │ └── agents/ # Huginn and Muninn subagents
│ ├── specs/_example/ # Seed feature folder
│ ├── docs/adr/ # Architecture Decision Records
│ └── harness/ # config.yml (mode, stack, evaluator rubric)
├── modes/
│ ├── quick/ # Quick-mode template overrides
│ └── full/ # Full-mode template overrides
├── stacks/ # Per-stack sensor configurations
└── docs/
 ├── philosophy.md # The "why" behind every design choice
 └── progression.md # Adoption path: manual → orchestrated

Non-goals

Sleipnir deliberately does not try to be:

A planner AI. Claude Code already has plan mode, and your brain is better at scoping than any model.

A replacement for your team's task board. Sleipnir's progress.md tracks the current work session, not the backlog. Your Jira, Linear, or Trello stays where it is.

A CI/CD tool. Sensors run locally and in pre-commit hooks; CI is your existing pipeline.

An AGENTS.md generator. Those should be written by humans, kept short, and pruned often. Auto-generated context files have been shown to reduce task success rates.


Status

This repository is being built iteratively and in public. Each delivery is small enough to review in one sitting.

  • Step 1 — Skeleton + README
  • Step 2init.sh with interactive questionnaire
  • Step 3spec.md, contract.md, progress.md, evaluation.md templates (Quick + Full)
  • Step 4 — Template AGENTS.md
  • Step 5 — Huginn and Muninn subagents
  • Step 6 — Slash commands (/harness-plan, /harness-implement, /harness-evaluate)
  • Step 7 — Stack configurations (node-ts, react-playwright, custom; with package manager support for pnpm/npm/yarn/bun)
  • Step 8 (optional) — Full-auto mode refinement once Stage 3 of the progression is reached
  • Step 9 (optional) — Migration to a Claude Code plugin

License

MIT. See LICENSE.

About

A minimal harness for disciplined AI-assisted development for Claude Code. Two ravens, one contract.

Topics

Resources

License

Stars

Watchers

Forks

Contributors

Languages

AltStyle によって変換されたページ (->オリジナル) /