holo-q/ripmap

ripmap

Codebase cartography for LLMs.

ripmap surfaces structurally significant code. It uses tree-sitter parsing, PageRank on the symbol graph, git-aware temporal signals, and LLM-driven parameter optimization.

Example

$ ripmap .
# Ranking: high (dense) | 1168 symbols | ~10728 tokens
src/ranking/pagerank.rs
  class PageRanker:
    def:
      compute_ranks(...)
      build_graph(...)
      pagerank(...)
src/extraction/treesitter.rs
  class TreeSitterParser:
    def:
      extract_tags(...)
      get_language(...)
$ ripmap --focus "cache" --tokens 2048
$ ripmap --calls --tokens 2048

Installation

$ cargo install --path .

How It Works

ripmap is a recurrent graph neural network with 55 trainable coordinates.

┌──────────────────────────────────────────────────────────────────────────┐
│ Shadow Pass   ──▶ PageRank     ──▶ Policy Engine ──▶ Final Pass          │
│ (high recall)     (importance)     (gating)          (high precision)    │
├──────────────────────────────────────────────────────────────────────────┤
│ 55 Trainable Coordinates                                                 │
│ ├── Shadow Strategy: name_match_weight, heuristic_confidence, ...        │
│ ├── Final Strategy:  type_hint_weight, import_weight, ...                │
│ └── Policy:          acceptance_bias, selection_temp, ...                │
└──────────────────────────────────────────────────────────────────────────┘

The pipeline has two hemispheres. The Shadow Pass casts a wide net via fuzzy name matching. PageRank propagates importance through the symbol graph. The Policy Engine decides when to stop exploring. The Final Pass verifies candidates with LSP-style precision.
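
To make the PageRank stage concrete, here is a minimal power-iteration sketch. It is not ripmap's implementation: the adjacency-list graph representation, the function name, and the handling of dangling nodes are assumptions made for illustration. Only the role of the damping factor α comes from the description above.

```rust
/// Power-iteration PageRank over a directed graph (illustrative sketch).
/// `edges[i]` lists the nodes that node `i` references.
/// `alpha` is the damping factor: the share of rank that follows edges.
fn pagerank(edges: &[Vec<usize>], alpha: f64, iterations: usize) -> Vec<f64> {
    let n = edges.len();
    if n == 0 {
        return Vec::new();
    }
    // Start from a uniform distribution.
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Every node receives the teleport share (1 - alpha) / n.
        let mut next = vec![(1.0 - alpha) / n as f64; n];
        for (src, targets) in edges.iter().enumerate() {
            if targets.is_empty() {
                // Dangling node: spread its rank evenly over all nodes.
                let share = alpha * rank[src] / n as f64;
                for r in next.iter_mut() {
                    *r += share;
                }
            } else {
                // Split this node's rank equally among the nodes it references.
                let share = alpha * rank[src] / targets.len() as f64;
                for &dst in targets {
                    next[dst] += share;
                }
            }
        }
        rank = next;
    }
    rank
}
```

In a symbol graph, an edge would typically run from a referencing symbol to the symbol it references, so heavily referenced definitions accumulate rank. A larger `alpha` lets importance travel further along edges; a smaller one keeps it closer to uniform.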

The coordinates control the physics:

  • PageRank damping (α): How far importance spreads through the graph
  • Strategy weights: Whether to trust name matching, type hints, or imports
  • Acceptance gates: Sigmoid thresholds for candidate quality
  • Focus decay: How quickly relevance fades from the query epicenter
  • Interaction mixing (λ): Interpolates between OR-logic and AND-logic
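
The coordinates listed above can be pictured as small continuous functions. The sketch below shows one plausible form for three of them. The function names, the exact formulas (logistic gate, exponential decay, probabilistic-sum OR), and the parameter names are all assumptions for illustration, not ripmap's actual definitions.

```rust
/// Soft acceptance gate (assumed logistic form): maps a score into (0, 1).
/// `bias` shifts the threshold; `temperature` controls how sharp the gate is.
fn acceptance_gate(score: f64, bias: f64, temperature: f64) -> f64 {
    1.0 / (1.0 + (-(score - bias) / temperature).exp())
}

/// Focus decay (assumed exponential form): relevance is multiplied by `decay`
/// for every hop away from the query match.
fn focus_decay(distance: u32, decay: f64) -> f64 {
    decay.powi(distance as i32)
}

/// Interaction mixing for two signals in [0, 1].
/// lambda = 0 gives a soft OR (probabilistic sum); lambda = 1 gives a soft AND (product).
fn mix_signals(a: f64, b: f64, lambda: f64) -> f64 {
    let or_like = a + b - a * b;
    let and_like = a * b;
    (1.0 - lambda) * or_like + lambda * and_like
}
```

Because each function is smooth in its parameters, nudging `bias`, `decay`, or `lambda` changes behaviour gradually instead of flipping a branch.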

Traditional code has discrete branches. ripmap dissolves these into continuous coordinates. The system can be "30% more greedy" rather than flipping a switch.

The hypothesis: these 55 numbers are universal constants. A single configuration governs navigation in any codebase, whether Python, Rust, TypeScript, or Go.

Training

The optimizer is not gradient descent. The optimizer is Claude.

$ ripmap-train --curated --reason --episodes 100 --agent claude

ripmap is trained by LLMs reasoning about why rankings fail. The LLM observes NDCG scores, analyzes failure cases, and proposes parameter adjustments with natural language rationale:

{
 "diagnosis": "High-PR distractors from unrelated modules flooding results",
 "proposed_changes": {
 "pagerank_alpha": ["decrease", "medium", "Reduce global spread"],
 "boost_focus_match": ["increase", "small", "Strengthen local signal"]
 },
 "confidence": 0.7
}

This is mesa-optimization. The "gradient" emerges from reasoning in concept space. The LLM can propose changes that no numerical gradient could express: "the multiplicative combination can't express OR-logic; add an additive pathway."
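
A qualitative proposal such as `["decrease", "medium", ...]` has to be turned into a numeric update at some point. The sketch below shows one way that could work. The types, the relative step sizes, the `Large` magnitude, and the idea of scaling the step by `confidence` are all hypothetical; the README does not specify how ripmap-train applies proposals.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy)]
enum Direction {
    Increase,
    Decrease,
}

#[derive(Clone, Copy)]
enum Magnitude {
    Small,
    Medium,
    Large, // assumed; only "small" and "medium" appear in the example proposal
}

/// One entry of `proposed_changes` (hypothetical in-memory form).
struct Proposal<'a> {
    name: &'a str,
    direction: Direction,
    magnitude: Magnitude,
}

/// Hypothetical mapping from a qualitative magnitude to a relative step.
fn relative_step(magnitude: Magnitude) -> f64 {
    match magnitude {
        Magnitude::Small => 0.05,
        Magnitude::Medium => 0.15,
        Magnitude::Large => 0.30,
    }
}

/// Apply each proposal as a multiplicative update, scaled by the stated confidence.
/// Proposals naming unknown parameters are ignored.
fn apply_proposals(params: &mut HashMap<String, f64>, proposals: &[Proposal<'_>], confidence: f64) {
    for p in proposals {
        if let Some(value) = params.get_mut(p.name) {
            let step = relative_step(p.magnitude) * confidence;
            let factor = match p.direction {
                Direction::Increase => 1.0 + step,
                Direction::Decrease => 1.0 - step,
            };
            *value *= factor;
        }
    }
}
```

Under these assumed step sizes, the example proposal with confidence 0.7 would shrink a `pagerank_alpha` of 0.85 (an illustrative starting value) to about 0.761 and grow `boost_focus_match` by 3.5%.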

Ground truth comes from git history. Files that changed together in bugfix commits have causal dependencies. ripmap learns to predict these relationships.
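
As a rough illustration of how co-change ground truth can score a ranking, the sketch below collects files that changed alongside a seed file and computes binary-relevance NDCG@k. The data layout (each commit as a list of file paths) and the function names are assumptions; ripmap's actual evaluation code may differ.

```rust
use std::collections::HashSet;

/// Files that appear in the same commit as `seed`, across all given commits.
fn co_changed<'a>(commits: &[Vec<&'a str>], seed: &str) -> HashSet<&'a str> {
    let mut related = HashSet::new();
    for files in commits {
        if files.iter().any(|f| *f == seed) {
            for &f in files {
                if f != seed {
                    related.insert(f);
                }
            }
        }
    }
    related
}

/// Standard logarithmic discount for a zero-based rank position.
fn discount(position: usize) -> f64 {
    1.0 / (position as f64 + 2.0).log2()
}

/// NDCG@k with binary relevance: a ranked file counts if it is in `relevant`.
fn ndcg_at_k(ranking: &[&str], relevant: &HashSet<&str>, k: usize) -> f64 {
    let mut dcg = 0.0;
    for (i, file) in ranking.iter().take(k).enumerate() {
        if relevant.contains(file) {
            dcg += discount(i);
        }
    }
    // Ideal DCG: all relevant files placed at the top of the list.
    let ideal: f64 = (0..relevant.len().min(k)).map(discount).sum();
    if ideal == 0.0 {
        0.0
    } else {
        dcg / ideal
    }
}
```

A ranking that places every co-changed file first scores 1.0; pushing a relevant file below an unrelated one lowers the score.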

Two-Level Stack

  • L1 (Inner Loop): Claude proposes parameter changes based on ranking failures
  • L2 (Outer Loop): Gemini evolves the prompt that steers L1's reasoning

L2 observes L1's performance across runs and mutates the promptgram by adding heuristics, adjusting policy, or changing reasoning style. The prompt is a program.

Usage

ripmap [OPTIONS] [FILES]...
Options:
  -f, --focus <QUERY>   Semantic search across symbol names
  -t, --tokens <N>      Output token budget [default: 8192]
  -e, --ext <EXT>       Filter by file extension
  -v, --verbose         Show progress
      --calls           Show call graph relationships
      --git-weight      Boost recently changed files
      --join            Concatenate full file contents

Configuration

Configuration is auto-detected from ripmap.toml, pyproject.toml, tsconfig.json, Cargo.toml, etc.

# ripmap.toml
include = ["src/**", "lib/**"]
exclude = ["**/generated/**"]

Performance

Stage                     Time
File discovery            ~50ms
Tag extraction (cached)   ~100ms
PageRank                  ~10ms
Total                     ~200ms

Languages

Python, Rust, JavaScript/TypeScript, Go, Java, C/C++, Ruby, PHP.

Documentation

License

MIT
