forked from colbymchenry/codegraph
feat: Complete Rust rewrite with 11 crates and MCP server #1
Open
EricDunaway wants to merge 78 commits into
- Complete crate structure for workspace layout
- Module-by-module implementation plan with code examples
- Dependency mapping from TypeScript to Rust equivalents
- Security design ensuring no data leaves local machine
- Adversarial scrutiny identifying completeness/accuracy issues
- Security audit with attack surface analysis
- Clarifying questions for implementation decisions
https://claude.ai/code/session_01GTwi1SnurHF7uvEVwN94Ha
- Replace brute-force cosine similarity with sqlite-vec extension
- Add sqlite-vec and zerocopy to dependencies
- Implement VectorSearchManager using vec0 virtual tables
- Add sqlite-vec initialization in DatabaseConnection
- Update security audit with sqlite-vec considerations
- Add embedding dimension validation
- Update clarifying questions (quantization options)
sqlite-vec is a pure C SQLite extension with no network access, making it suitable for the local-first security model.
https://claude.ai/code/session_01GTwi1SnurHF7uvEVwN94Ha
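For context, the brute-force approach that this commit replaces can be sketched roughly as below, together with the kind of dimension check the commit mentions. This is an illustration only: `cosine_similarity` and `top_k` are hypothetical names, not the project's API, and the real code now delegates search to sqlite-vec.

```rust
/// Cosine similarity between two embeddings (hypothetical helper).
/// Returns None when dimensions differ, a vector is empty, or a norm is zero.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None; // dimension validation
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Brute-force top-k: scores every stored vector, so each query costs O(n * d).
/// Vectors with a mismatched dimension are skipped rather than scored.
pub fn top_k<'a>(
    query: &[f32],
    stored: &'a [(String, Vec<f32>)],
    k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = stored
        .iter()
        .filter_map(|(id, v)| cosine_similarity(query, v).map(|s| (id.as_str(), s)))
        .collect();
    // Highest similarity first.
    scored.sort_by(|x, y| y.1.total_cmp(&x.1));
    scored.truncate(k);
    scored
}
```

The linear scan over every stored vector is the cost that an indexed vector table is meant to avoid.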
- Replace direct ort usage with rust-bert's SentenceEmbeddingsModel
- rust-bert provides high-level API with built-in tokenization
- Same ort/ONNX backend, cleaner interface
- Add EmbeddingModel enum for supported model types
- Configure local-only model loading (no RemoteResource)
- Set RUSTBERT_CACHE to prevent network downloads
https://claude.ai/code/session_01GTwi1SnurHF7uvEVwN94Ha
- Platform-specific ort configuration:
  - macOS: ort with load-dynamic + coreml (GPU/Neural Engine)
  - Linux/Windows: ort with load-dynamic (CPU only)
- TextEmbedder uses CoreMLExecutionProvider on macOS
- Add ONNX Runtime setup instructions for no-network design
- Document ORT_DYLIB_PATH environment variable requirement
https://claude.ai/code/session_01GTwi1SnurHF7uvEVwN94Ha
- TypeScript/JavaScript decorator extraction with framework detection
- Rust attribute macro extraction (#[derive], #[test], #[tokio::main])
- New codegraph_file_nodes MCP tool to list symbols in a file
- Hybrid embedding strategy with dual-embeddings feature flag
- StarEncoder for code, nomic-embed-text for comments/docstrings
- Separate vec_code and vec_text tables
- Auto-detection of query type (code vs natural language)
https://claude.ai/code/session_01GTwi1SnurHF7uvEVwN94Ha
Key changes:
- Resolved model conflicts: standardized on nomic-embed-text-v1.5 (768 dims)
- Added full MCP tool implementations (search, context, callers, callees, impact)
- Added ContextBuilder specification with semantic→text search fallback
- Added missing types: SearchResult, CodeBlock, TaskContext, BuildContextOptions
- Added database query layer additions (search_nodes, get_nodes_in_file, merge_subgraphs)
- Removed Liquid language (deferred to future, not in target languages)
- Fixed EmbeddingModel enum to use nomic-embed-text-v1.5 as default
- Added Section 9.9 Future Enhancements (Liquid, Context7, Doc tool)
- Identified TypeScript feature gaps and provided resolution options
Breaking changes from TypeScript:
- None - maintains full behavioral parity
https://claude.ai/code/session_01GTwi1SnurHF7uvEVwN94Ha
Primary targets now include:
- TypeScript/JavaScript (+ embedded GraphQL)
- Rust (+ attribute macros)
- PHP
- Dart/Flutter (+ Riverpod, Bloc, get_it, GoRouter, embedded GraphQL)
- GraphQL (standalone + embedded in TS/JS/Dart)
- Bash (functions, source imports)
- Terraform/HCL (resources, modules, variables, outputs, data)
New extractors added:
- DartExtractor with Flutter widget detection and annotation patterns
- GraphQLExtractor for operations, fragments, variables
- BashExtractor for functions and source imports
- HclExtractor for Terraform blocks
Framework patterns:
- Flutter: StatelessWidget, StatefulWidget, ConsumerWidget
- State management: Riverpod, Bloc, Provider
- DI: get_it, injectable annotations
- Routing: GoRouter, AutoRoute
- Terraform: resource, module, variable, output, data blocks
Dependencies added:
- tree-sitter-dart = "0.0"
- tree-sitter-graphql = "0.2"
- tree-sitter-bash = "0.21"
- tree-sitter-hcl = "1.1"
https://claude.ai/code/session_01GTwi1SnurHF7uvEVwN94Ha
Complete implementation of CodeGraph in Rust with the following crates:
Core infrastructure:
- codegraph-types: Node, Edge, NodeKind, EdgeKind types (9 tests)
- codegraph-db: SQLite + FTS5 database layer (6 tests)
- codegraph-extraction: Tree-sitter AST parsing (13 tests)
- codegraph-graph: BFS/DFS traversal, impact radius (10 tests)
Analysis features:
- codegraph-resolution: Reference resolution with name matching (13 tests)
- codegraph-context: Context building for AI assistants (10 tests)
- codegraph-vectors: ONNX embeddings with optional CoreML (18 tests)
- codegraph-sync: Incremental sync + git hooks (15 tests)
Integration:
- codegraph-mcp: MCP JSON-RPC server with 7 tools (14 tests)
- codegraph-core: Main library facade (12 tests)
- codegraph-cli: Full CLI (init, index, sync, status, query, context, hooks, serve)
Key changes:
- Use direct ort instead of rust-bert for CoreML M4 support
- 75+ exclude patterns (node_modules, target, .git, etc.)
- 50+ builtin symbol filters (console, Promise, React, etc.)
- Feature-flagged ONNX embeddings
- FTS5 full-text search
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace regex-based extraction with proper tree-sitter AST parsing:
- Add TreeSitterParser wrapper for multi-language grammar loading
- Add configuration-driven LanguageConfig for node type mappings
- Add TreeSitterExtractor that extracts nodes based on configuration
- Extract decorators/attributes from TypeScript, Rust, Python, etc.
- Add deduplication to prevent duplicate node extraction
- Update to tree-sitter 0.20 API (functions instead of constants)
- Add ParseFailed error variant for tree-sitter parse failures
Supported languages: TypeScript, JavaScript, Rust, Python, Go, PHP, Java, C, C++, C#, Ruby, Bash
Disabled (version conflicts): Swift, Kotlin, Dart, GraphQL, HCL
Verified decorator extraction on real TypeScript code with @AppSyncQuery, @lambdafunction, @IAMPolicyAccessDynamoDBTable
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Comprehensive design for enriching CodeGraph embeddings with:
- LSP integration (TypeScript, Dart, Rust) for inferred types
- Graph context (callers, callees, siblings, implements, extends)
- New extraction (code snippets, thrown errors, test associations)
- Module/package membership with semantic naming
Key architectural decisions:
- Separate enrichment phases after extraction (A1)
- Graph context computed inline during embedding (A2, B2)
- Hybrid LSP scope: comprehensive on index, selective on sync (L4)
- Tiered truncation with token budget management (B1-B8)
- Incremental updates with dependency tracking (I1-I13)
Includes 8 conflict reviews ensuring internal consistency.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Include [id: ...] in output for search, callers, callees, impact, and file_nodes tools so Claude can use returned IDs with other tools. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add #[serde(rename_all = "camelCase")] to ToolDefinition and ToolCallResult structs so inputSchema and isError serialize correctly
- Handle notifications (requests without id) by returning None instead of "method not found" error - per JSON-RPC spec, notifications must not receive responses
- Add test for notification handling
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
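The notification rule described in this commit can be illustrated with a small dependency-free sketch. The `Request` and `Response` types and the `handle` function below are hypothetical stand-ins, not the types in codegraph-mcp; only the error code -32601 ("Method not found") comes from the JSON-RPC 2.0 specification.

```rust
/// Minimal stand-in for a parsed JSON-RPC 2.0 message (hypothetical type).
pub struct Request {
    /// `None` means the message is a notification.
    pub id: Option<u64>,
    pub method: String,
}

#[derive(Debug, PartialEq)]
pub enum Response {
    Result { id: u64, body: String },
    Error { id: u64, code: i32, message: String },
}

/// Returns `None` for notifications so the transport layer writes nothing back.
pub fn handle(req: &Request) -> Option<Response> {
    // A missing id short-circuits before method dispatch, so even an
    // unknown notification never produces a "method not found" error.
    let id = req.id?;
    Some(match req.method.as_str() {
        "ping" => Response::Result { id, body: "{}".to_string() },
        _ => Response::Error {
            id,
            code: -32601, // JSON-RPC 2.0 "Method not found"
            message: "Method not found".to_string(),
        },
    })
}
```

Checking the id before dispatch keeps the "no response" rule in one place instead of repeating it in every method handler.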
... API
- Add model path resolution searching .codegraph/models/ (project then user level)
- Update to ort 2.0 API (Tensor::from_array, ort::inputs! macro)
- Auto-enable CoreML on macOS Apple Silicon via target-specific deps
- Add token_type_ids for proper BERT-style model input
- Fix borrow issues with mean_pool as standalone function
- Add CODEGRAPH_NO_COREML env var to disable CoreML
- Add test examples for embedder validation
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
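As an illustration of what a standalone mean-pooling function might look like, here is a minimal sketch over plain slices. The signature is an assumption: the project's `mean_pool` presumably operates on ort tensors, and its exact inputs are not shown in this PR.

```rust
/// Mean-pool token embeddings into one sentence vector, ignoring padding.
/// `token_embeddings` is row-major with `hidden` values per token;
/// `attention_mask` has one entry per token (0 marks a padded position).
/// Signature and layout are assumptions made for this sketch.
pub fn mean_pool(token_embeddings: &[f32], attention_mask: &[i64], hidden: usize) -> Vec<f32> {
    if hidden == 0 {
        return Vec::new();
    }
    let mut pooled = vec![0.0f32; hidden];
    let mut count = 0.0f32;
    for (row, &mask) in token_embeddings.chunks_exact(hidden).zip(attention_mask) {
        if mask != 0 {
            for (acc, v) in pooled.iter_mut().zip(row) {
                *acc += v;
            }
            count += 1.0;
        }
    }
    // Leave the zero vector untouched when every position was masked out.
    if count > 0.0 {
        for acc in &mut pooled {
            *acc /= count;
        }
    }
    pooled
}
```

Because a free function takes only borrowed slices, it does not need to borrow the embedder that owns the session, which may be the kind of borrow issue the commit refers to.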
Swift, Kotlin, Dart disabled pending tree-sitter version resolution. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add implementation plan with 50 tasks across 8 milestones
- Add linked repos feature design document
- Add .mcp.json for CodeGraph MCP server configuration
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Address critical issues found in adversarial review:
- Add Task 2: Update Node struct with enrichment fields
- Add Task 3: Update QueryBuilder for new columns
- Add Task 21: Create codegraph-lsp crate with Cargo.toml
- Add Task 30: Async/Sync bridge for LSP integration
- Fix test setup helpers throughout
- Correct tiktoken-rs API usage
- Fix JSON comparison in batch SQL queries
- Proper dependency ordering between tasks
Total: 54 tasks across 8 milestones (up from 50)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Expand Tasks 31-54 with detailed test code and implementation
- Add I4 (blocking execution), G5 (cycle safety), G6 (call frequency)
- Add I6/I7 (retry logic, per-file error handling) with tests
- Add L5 (server lifecycle) with lazy spawn/shutdown tests
- Add all incremental update tasks (I11 edge diff, I12 file locking)
- Add test convention detection (E6, E6a) for TS/Rust/Python
- Add package/workspace detection (M1, M3) for monorepos
- Add quality measurement framework (Q1) with A/B comparison
- Consolidate review document (forward + reverse audits pass)
- Change from tower-lsp to async-lsp (client vs server library)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
...hQL/HCL
Tree-sitter upgrade:
- Update tree-sitter core from 0.20 to 0.26
- Update all grammar crates to latest compatible versions
- Migrate to new API: LANGUAGE constants instead of language() functions
- Update parser.set_language() to take reference
New language support:
- Add Dart via tree-sitter-dart-orchard 0.3
- Add Swift via tree-sitter-swift 0.7
- Add GraphQL via tree-sitter-graphql 0.1
- Add HCL/Terraform via tree-sitter-hcl 1.1
- Add lang-dart to default features
Language configs:
- Dart: classes, methods, enums, mixins with @annotation decorator support
- Swift: classes, structs, protocols, functions with @Attribute decorator support
- GraphQL: types, interfaces, enums, operations, fragments with @directive support
- HCL: blocks and attributes for Terraform resource extraction
Note: Kotlin remains blocked by tree-sitter version constraint (>=0.21, <0.23)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Schema & Types (Tasks 1-6):
- Add schema migration infrastructure with v2 enrichment columns
- Add enrichment fields to Node struct (inferred_type, resolved_import_path, code_snippet, thrown_errors, test_names, package_name)
- Update QueryBuilder for enrichment columns with graceful v1 fallback
- Add enrichment config types (LspConfig, EnrichmentConfig, EmbeddingTextConfig)
- Add JSON config file parsing with .codegraph/config.json support
- Add metadata table operations for version tracking
Auto-run migrations on database connection open.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
...one 2 partial)
Tasks 7-9: Embedding Text Module
- Add EmbeddingTextBuilder for constructing text representations
- Implement GraphContext and NodeEnrichment structs
- Add TokenCounter using tiktoken-rs cl100k_base (B5)
- Implement tiered truncation with overflow protection (B3, B6)
- Decorator-first ordering per design (E1)
- Graph context limits (max_callees, max_callers, max_siblings)
- 11 tests covering all functionality
Dependencies:
- Add tiktoken-rs = "0.6" to workspace
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New modules for code intelligence extraction:
- snippet.rs: Code snippet extraction with configurable line limits and truncation markers (E4)
- errors.rs: Thrown error type extraction via regex patterns for 10+ languages including TS, Rust, Python, Go, Java (E8)
- test_detection.rs: Test file detection by naming conventions, import extraction, and hybrid test-symbol association (E6a, E6)
- package.rs: Package name extraction from manifests (package.json, Cargo.toml, pubspec.yaml, go.mod, pyproject.toml, composer.json) with workspace/monorepo support (M1, M2, M3)
Total: 95 tests passing in codegraph-extraction
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New modules for selective re-indexing and embedding:
- lock.rs: File-based locking with 5-minute stale detection for exclusive indexing access (Task 39)
- reembed.rs: Re-embed trigger detection via schema migration, config hash, model hash, and force flag (Task 40)
- selective.rs: Cascade depth support for transitive dependency resolution with cycle detection (Task 41)
- edge_diff.rs: Edge diffing utilities for change detection
- sync.rs: Enhanced with SelectiveScope, enrichment stats, and lock integration (Task 42)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
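The 5-minute stale detection in lock.rs could be expressed along the following lines. This is a sketch under assumptions: `is_stale` and `STALE_AFTER` are hypothetical names, and how the real lock records its refresh time (file mtime, a timestamp inside the lock file, or something else) is not stated in the commit.

```rust
use std::time::{Duration, SystemTime};

/// A lock not refreshed for longer than this is considered abandoned.
pub const STALE_AFTER: Duration = Duration::from_secs(5 * 60);

/// Decide whether a lock last refreshed at `last_refreshed` is stale at `now`.
/// Both timestamps are passed in so the check is deterministic and testable.
pub fn is_stale(last_refreshed: SystemTime, now: SystemTime) -> bool {
    match now.duration_since(last_refreshed) {
        Ok(age) => age > STALE_AFTER,
        // Refresh time lies in the future (clock skew): treat the lock as live.
        Err(_) => false,
    }
}
```

A caller would typically pass `SystemTime::now()` as `now` and only remove the lock file when the function returns `true`.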
New crate providing LSP client integration for type inference:
- client.rs: Generic LSP client with tower-lsp
- lifecycle.rs: Server lifecycle management (init, shutdown)
- enricher.rs: LspEnricher trait for language-specific enrichment
- batch.rs: Batch enrichment with configurable concurrency
- sync_bridge.rs: Async/sync bridge using tokio runtime
Language-specific enrichers:
- typescript.rs: TypeScript/JavaScript via tsserver
- rust_analyzer.rs: Rust via rust-analyzer
- python.rs: Python via pylsp/pyright
- go.rs: Go via gopls
- dart.rs: Dart via dart analysis server
Supports hover-based type inference and import resolution.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
codegraph-types:
- Add EnrichmentConfig with cascade_depth, token_budget, concurrency
- Add enrichment fields to Node struct (inferred_type, code_snippet, etc.)
codegraph-vectors:
- Add EmbeddingTextBuilder for structured embedding text generation
- Add token counting with tiktoken-compatible encoding
- Add tiered truncation for token budget enforcement
codegraph-db:
- Add enrichment_deps table for dependency tracking
- Add metadata table for embedding state (schema version, config hash)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
codegraph-core:
- Add embedding.rs for orchestrating enrichment pipeline
- Add EmbeddingError variants for enrichment failures
codegraph-graph:
- Add find_affected_nodes for impact analysis
- Add cascade traversal for dependency resolution
Workspace:
- Add codegraph-lsp to workspace members
- Update Cargo.lock with new dependencies
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
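A cascade traversal with a depth limit and cycle safety can be sketched as a breadth-first search over reverse-dependency edges. Everything here is illustrative: `find_affected` is a hypothetical function, node IDs are simplified to `u32`, and the graph is an in-memory map rather than the project's SQLite-backed edges.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Collect `start` plus every node reachable through `dependents`
/// (node -> nodes that depend on it) within `max_depth` hops.
pub fn find_affected(
    dependents: &HashMap<u32, Vec<u32>>,
    start: &[u32],
    max_depth: usize,
) -> HashSet<u32> {
    let mut seen: HashSet<u32> = start.iter().copied().collect();
    let mut queue: VecDeque<(u32, usize)> = start.iter().map(|&n| (n, 0)).collect();
    while let Some((node, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // cascade depth reached: do not expand further
        }
        if let Some(nexts) = dependents.get(&node) {
            for &next in nexts {
                // `insert` returns false for nodes already visited, which
                // is what keeps the walk finite on cyclic graphs.
                if seen.insert(next) {
                    queue.push_back((next, depth + 1));
                }
            }
        }
    }
    seen
}
```

Breadth-first order means each node is first reached at its minimum distance, so the depth cutoff behaves predictably even when several paths lead to the same node.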
Security and correctness fixes identified by code review:
1. AbortHandle leak in Drop: Reader task wasn't aborted when LspClient was dropped, causing background task to persist. Added abort() call in Drop implementation.
2. Content-Length DoS: Reduced max Content-Length from 100MB to 10MB. LSP responses are typically <100KB, so 10MB is more than sufficient while reducing memory exhaustion risk.
3. UTF-16 position bounds: Changed utf16_to_byte to return Option<usize> instead of silently clamping out-of-bounds offsets. This prevents invalid positions from propagating through the LSP pipeline.
Also added test_utf16_to_byte_out_of_bounds test to verify bounds checking.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
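To make the third fix concrete, here is one way a bounds-checked UTF-16 to byte conversion could look. Only the name `utf16_to_byte` and the `Option<usize>` return type come from the commit; the parameter list and the choice to reject offsets that fall inside a surrogate pair are assumptions of this sketch.

```rust
/// Convert a UTF-16 code-unit offset within `line` into a byte offset.
/// Returns None when the offset is past the end of the line or lands
/// in the middle of a surrogate pair (the latter is an assumption).
pub fn utf16_to_byte(line: &str, utf16_offset: usize) -> Option<usize> {
    let mut units = 0usize;
    for (byte_idx, ch) in line.char_indices() {
        if units == utf16_offset {
            return Some(byte_idx);
        }
        if units > utf16_offset {
            return None; // offset points inside the previous character
        }
        units += ch.len_utf16();
    }
    // An offset equal to the total length addresses the end of the line.
    if units == utf16_offset {
        Some(line.len())
    } else {
        None
    }
}
```

Returning `None` forces each caller to decide what an invalid position means, instead of silently mapping it to the end of the line.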
Add git-based staleness detection to MCP tools:
- New git.rs module for checking dirty files via `git status --porcelain`
- Query results now show warning when touching uncommitted files
- New codegraph_status tool shows index health: file/node counts, last sync time, git hooks status, and list of dirty files
This helps AI assistants know when index data may be outdated and prompts them to run `codegraph sync` when needed.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
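A rough sketch of turning `git status --porcelain` output into a list of dirty paths is shown below. It assumes the v1 short format (two status characters, a space, then the path, with renames written as `old -> new`) and ignores the quoting git applies to unusual path names. `dirty_files` is a hypothetical helper, not the function in git.rs.

```rust
/// Extract paths from `git status --porcelain` (v1) output.
/// For rename entries only the new path is kept. Quoted paths are
/// returned as-is, which is a known limitation of this sketch.
pub fn dirty_files(porcelain: &str) -> Vec<String> {
    porcelain
        .lines()
        .filter_map(|line| {
            // Skip the two status columns and the separating space.
            let rest = line.get(3..)?;
            // "old -> new": keep the part after the last arrow.
            let path = rest.rsplit(" -> ").next().unwrap_or(rest);
            if path.is_empty() {
                None
            } else {
                Some(path.to_string())
            }
        })
        .collect()
}
```

A tool could then intersect this list with the files touched by a query result to decide whether to show the staleness warning.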
- Add docs/issues.md tracking P0 bugs (missing edges, code_snippet)
- Add docs/gaps.md tracking feature gaps vs competitors
- Update CLAUDE.md MCP tools table with accurate status (✅/⚠️/❌)
- Add DB verification commands for extraction state
- Document != escaping gotcha (use <> instead)
- Fix sync command to use SyncManager properly
- Fix insert_node to use INSERT OR REPLACE for re-indexing
- Add upsert_file() call in index_all() to populate files table
- Export SyncConfig from codegraph-sync
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
EricDunaway changed the title from "Add comprehensive Rust rewrite plan with security analysis" to "feat: Complete Rust rewrite with 11 crates and MCP server" on Feb 6, 2026
...ements
Implement calls, imports, and extends/implements edge extraction from tree-sitter AST, fixing the 3 broken MCP tools (codegraph_callers, codegraph_callees, codegraph_impact) and partially broken codegraph_node.
Key changes:
- Extract call, import, and inheritance relationships during AST walk
- Populate code_snippet on all nodes for codegraph_node display
- Store unresolved references during indexing for post-extraction resolution
- Propagate actual EdgeKind through resolver (was hardcoded to References)
- Extract test functions from Rust mod_item blocks (386 new function nodes)
- Expand BUILTIN_SYMBOLS with ~200 Rust/Go/JS stdlib names to reduce noise
- Add scope-aware resolution preferring same-file matches over ambiguous cross-file matches, eliminating false positive edges
Results on self-index: 3249 nodes, 2372 call edges, 545 import edges, 1437 remaining unresolved (down from 5574), 0 false positive stdlib edges.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
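The scope-aware resolution rule (prefer a same-file match, refuse to guess between ambiguous cross-file matches) might be sketched as follows. The `Candidate` type and `resolve` function are hypothetical, and the exact tie-breaking in the real resolver may differ, for example by consulting imports.

```rust
/// A symbol definition that matches a reference by name (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: String,
    pub file: String,
}

/// Pick a target for a reference that originates in `ref_file`.
/// 1. A unique match in the same file wins.
/// 2. Otherwise a unique match anywhere is accepted.
/// 3. Anything ambiguous stays unresolved rather than creating a false edge.
pub fn resolve<'a>(ref_file: &str, candidates: &'a [Candidate]) -> Option<&'a Candidate> {
    let same_file: Vec<&Candidate> =
        candidates.iter().filter(|c| c.file == ref_file).collect();
    if same_file.len() == 1 {
        return Some(same_file[0]);
    }
    if same_file.is_empty() && candidates.len() == 1 {
        return candidates.first();
    }
    None
}
```

Leaving ambiguous references unresolved trades recall for precision, which fits the reported goal of eliminating false positive edges.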
... (Tasks 11-12)
- Add 4 integration tests: sync returns FullSyncResult, detects modified files, detects deleted files (verifies deleted_node_ids), detects new files
- Fix full_reembed flag: only set true when model available and embeddings actually regenerated (Codex finding #2)
- Add "embeddings skipped (model unavailable)" CLI message (Codex finding #3)
- Create docs/plans/2026-02-12-codex-deferred-suggestions.md summarizing all deferred Codex suggestions across Tasks 2-10
- Update CLAUDE.md with embedding pipeline gotchas, sync crate description, plan doc references, and metadata/schema_version table docs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive sync pipeline redesign plan covering change detection, extraction, resolution, and embedding as one end-to-end system. Enable onnx feature by default in CLI. Remove superseded incremental embedding sync plans. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
...overage Changes &self to &mut self to clear node_cache/cache_order (matching the existing clear() method pattern). Updated test to verify unresolved_refs table and cache staleness after clear. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
...lock Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Review fix #4: Replace CodeGraphError::Other with CodeGraphError::Sync for lock errors in index_all(), preserving structured error handling.
Review fix #5: Add file record and unresolved ref inserts to test_clear_all_graph_data so assertions on file_count and unresolved_refs count are non-vacuous.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Change insert_edge to use INSERT OR IGNORE so duplicate edges are silently skipped (works with the unique index from v3 migration). Filter get_all_unresolved_refs to only return resolved=0 rows. Add four new query methods for scoped resolution: get_unresolved_refs_by_files, get_symbol_names_in_files, get_unresolved_refs_by_names_capped, and mark_unresolved_ref_resolved. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds targeted pre-delete impact capture that queries only nodes affected by changed files, replacing the O(E) EdgeSnapshot approach on the hot path with queries proportional to changed files. Chunked at 500 IDs per SQL IN clause for safety. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
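Chunking an ID list so that no single SQL `IN` clause exceeds 500 parameters could look like the sketch below. The table and column names (`edges`, `source`, `target`) and the function name are assumptions made for illustration; the queries in codegraph-db may be shaped differently.

```rust
/// Upper bound on bound parameters per query, as stated in the commit.
pub const MAX_IDS_PER_QUERY: usize = 500;

/// Build one parameterized query per chunk of at most 500 IDs.
/// Each tuple pairs the SQL text with the slice of IDs to bind to it.
pub fn chunked_in_queries<'a>(ids: &'a [String]) -> Vec<(String, &'a [String])> {
    ids.chunks(MAX_IDS_PER_QUERY)
        .map(|chunk| {
            // One "?" placeholder per ID keeps the values out of the SQL text.
            let placeholders = vec!["?"; chunk.len()].join(", ");
            let sql = format!(
                "SELECT source, target FROM edges WHERE source IN ({})",
                placeholders
            );
            (sql, chunk)
        })
        .collect()
}
```

An empty ID list yields no queries at all, which avoids emitting the invalid `IN ()` form.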
Capture pre-delete neighbor/sibling impact in process_modify and process_delete before nodes are removed, accumulate across all file changes, and expose via SyncResult.pre_delete_impact for downstream embedding candidate computation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
...ution Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace EdgeSnapshot/EdgeDiff with ImpactCapture on the sync hot path:
- Remove pre/post EdgeSnapshot capture (O(all edges) -> O(changed nodes))
- Use SyncResult.pre_delete_impact for neighbor/sibling detection
- Replace resolve_all() with scoped resolve_for_files() for changed files
- Compute embed candidates: changed ∪ impact ∪ siblings ∪ resolver - deleted
- Add >30% changeset warning heuristic for full-reindex fallback (M3)
- Retain legacy sync_embeddings(EdgeDiff) for --verify-sync mode (M4)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
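The embed-candidate formula in the list above maps directly onto set operations. The sketch below uses `HashSet<String>` for node IDs; the function name `compute_embed_candidates` appears in a later commit message in this PR, but this signature is an assumption.

```rust
use std::collections::HashSet;

/// candidates = (changed ∪ impact ∪ siblings ∪ resolver) − deleted
pub fn compute_embed_candidates(
    changed: &HashSet<String>,
    impact: &HashSet<String>,
    siblings: &HashSet<String>,
    resolver: &HashSet<String>,
    deleted: &HashSet<String>,
) -> HashSet<String> {
    changed
        .iter()
        .chain(impact)
        .chain(siblings)
        .chain(resolver)
        // Deleted nodes no longer exist, so they cannot be re-embedded.
        .filter(|id| !deleted.contains(*id))
        .cloned()
        .collect()
}
```

Collecting into a `HashSet` removes duplicates, so a node that is both changed and in the impact set is embedded only once.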
...ble kinds
Fix #1: Replace `let _ = insert_edge(...)` with proper error handling. On edge insert failure, log a warning and leave the ref unresolved (target set to None) so stats and scoped resolution bookkeeping remain consistent with persisted state.
Fix #3: Add NodeKind::Component to EMBEDDABLE_KINDS array so framework components (React, Flutter, etc.) get embeddings for semantic search.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Uses git diff --name-status -z to identify changed files between a checkpoint commit and HEAD, much faster than full filesystem hash scan for incremental updates. Merges committed changes, working tree changes, and untracked files, filtering to supported languages and exclude patterns. Also makes compute_hash public for reuse across modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
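Parsing the NUL-separated output of `git diff --name-status -z` might be sketched as below. This assumes the usual layout of that format, where each record is a status field followed by one path, and rename or copy records (status starting with `R` or `C`) carry two paths. The `Change` enum and function name are hypothetical, and treating a copy as an added file is a simplification of this sketch.

```rust
#[derive(Debug, PartialEq)]
pub enum Change {
    Added(String),
    Modified(String),
    Deleted(String),
    Renamed { from: String, to: String },
}

/// Parse `git diff --name-status -z` output into a list of changes.
pub fn parse_name_status_z(output: &str) -> Vec<Change> {
    let mut fields = output.split('\0').filter(|f| !f.is_empty());
    let mut changes = Vec::new();
    while let Some(status) = fields.next() {
        let Some(path) = fields.next() else { break };
        let path = path.to_string();
        // Rename/copy statuses carry a similarity score, e.g. "R100".
        match status.chars().next() {
            Some('A') => changes.push(Change::Added(path)),
            Some('D') => changes.push(Change::Deleted(path)),
            Some('R') => {
                let Some(to) = fields.next() else { break };
                changes.push(Change::Renamed { from: path, to: to.to_string() });
            }
            Some('C') => {
                // The source of a copy is unchanged; only the new path matters.
                let Some(to) = fields.next() else { break };
                changes.push(Change::Added(to.to_string()));
            }
            // M (modified), T (type change) and anything else.
            _ => changes.push(Change::Modified(path)),
        }
    }
    changes
}
```

Using `-z` avoids git's path quoting, so file names with spaces or non-ASCII characters arrive unmodified.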
Add checkpoint.rs to manage sync.last_head and sync.last_timestamp metadata entries. Provides read/write helpers for tracking the last synced git HEAD SHA, and a utility to resolve the current HEAD via git rev-parse. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add pending.rs to manage the sync.pending / sync.processing lifecycle. When a git hook fires while a sync is already running, PendingSync writes a coalesced event file that the active sync drains on completion. Uses atomic write-then-rename for safe concurrent access. Also adds SyncError::Other(String) variant for general error reporting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
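The atomic write-then-rename pattern mentioned here can be sketched with the standard library alone. `write_atomic` and the temporary-file naming scheme are assumptions; the sketch relies on `std::fs::rename` replacing the destination in one step when both paths are on the same filesystem.

```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};

/// Write `contents` to `target` so that readers see either the old file
/// or the complete new file, never a partially written one.
pub fn write_atomic(target: &Path, contents: &[u8]) -> std::io::Result<()> {
    // The temp file lives next to the target so the rename stays on one
    // filesystem; the PID suffix keeps concurrent writers apart.
    let mut tmp_name = target.as_os_str().to_os_string();
    tmp_name.push(format!(".tmp.{}", std::process::id()));
    let tmp = PathBuf::from(tmp_name);
    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(contents)?;
        file.sync_all()?; // flush data to disk before publishing the file
    }
    fs::rename(&tmp, target)
}
```

With this shape, a hook process that loses the race simply overwrites the pending file with its own complete event, which matches the coalescing behavior described in the commit.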
Adds a method that combines lock acquisition with pending-write on collision. When the lock is held by another sync, writes sync.pending instead of failing, allowing hook events to be coalesced and drained by the running sync on completion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
... manager detection
Major changes to git hooks management:
- Add post-rewrite hook to the managed hooks list
- New hook script template passes hook name via --hook flag
- Rename backup suffix from .codegraph-backup to .codegraph-orig
- Use git rev-parse --git-path hooks instead of hardcoded .git/hooks
- Add Husky/Lefthook detection with --force bypass
- Add conflict detection: refuse if .codegraph-orig backup already exists
- Update GitHooksManager struct with force and repo_root fields
- Update CLI callers to pass force=false to new constructor
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
...hooks dir Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ensure_codegraph_in_gitignore() helper that creates or appends .codegraph/ to .gitignore when initializing a project inside a git repo. Failure to update .gitignore is non-fatal (logged as warning). Also includes sync redesign changes: sync_with_options() with full Phase 0-7 pipeline, SyncOptions struct, hook-mode support, checkpoint writes, pending drain loop, and sync.failed marker support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
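The create-or-append behavior of such a helper can be sketched as a pure function over the file's contents. The commit names `ensure_codegraph_in_gitignore()`; the function below, `with_codegraph_ignored`, is a hypothetical core for it, and the set of spellings treated as "already ignored" is an assumption.

```rust
/// Return `.gitignore` contents that are guaranteed to ignore `.codegraph/`.
/// `existing` is the current file content, or "" when the file is missing.
pub fn with_codegraph_ignored(existing: &str) -> String {
    let already = existing
        .lines()
        .any(|l| matches!(l.trim(), ".codegraph" | ".codegraph/" | "/.codegraph/"));
    if already {
        return existing.to_string();
    }
    let mut out = existing.to_string();
    // Avoid gluing the new entry onto an unterminated last line.
    if !out.is_empty() && !out.ends_with('\n') {
        out.push('\n');
    }
    out.push_str(".codegraph/\n");
    out
}
```

Keeping the string logic separate from file I/O makes the non-fatal failure path easy to handle: the caller reads, transforms, writes, and logs a warning if either I/O step fails.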
... and validation
- Add PID verification to lock release_internal() to prevent deleting locks owned by other processes
- Add lock.refresh() calls at phase boundaries (after Phase 3, 4, and at start of Phase 5) in sync_with_options() to prevent stale lock detection during long syncs
- Write sync.failed in CLI hook-mode error paths (CodeGraph::open failure and sync_with_options error) for diagnostic visibility
- Add InvalidHookName error variant and validation guards to install_hook(), uninstall_hook(), and is_hook_installed()
- Replace hook script template with full chaining implementation that preserves original hooks via .codegraph-orig backup, uses basename for hook name discovery, and runs sync in background with logging
- Add --force flag to `codegraph hooks install` CLI command, passed through to GitHooksManager to bypass hook manager detection
- Add .gitignore update for .codegraph/ after successful hooks install
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When verify_sync is true, captures EdgeSnapshot before and after sync, computes EdgeDiff, and compares affected nodes against ImpactCapture embed candidates to validate the incremental approach catches the same set of affected nodes as the full EdgeSnapshot approach. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EdgeDiff.affected_nodes includes IDs of deleted nodes (their edges disappeared), but compute_embed_candidates correctly excludes them since deleted nodes can't be re-embedded. Filter truly_deleted from diff_affected before comparing to prevent false "missed" reports. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
...dFailed gracefully The 30% full-reindex threshold was triggering on tiny projects (1-2 files), losing granular sync stats and causing test failures. Added minimum 10-file guard. Also handle ModelLoadFailed in generate_embeddings() gracefully (return Ok(0) instead of propagating) since CoreML compilation can fail even when the model file exists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
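The threshold with its small-project guard could be written as a small predicate. The constant and function names are hypothetical, and the sketch assumes the 10-file minimum applies to the total number of indexed files; the commit does not say whether it is measured against total or changed files.

```rust
/// Changesets above this fraction of the project trigger the fallback.
pub const FULL_REINDEX_RATIO: f64 = 0.30;
/// Projects smaller than this never trigger it (assumed to mean total files).
pub const MIN_FILES_FOR_FALLBACK: usize = 10;

/// Decide whether an incremental sync should fall back to a full reindex.
pub fn should_full_reindex(changed_files: usize, total_files: usize) -> bool {
    if total_files < MIN_FILES_FOR_FALLBACK {
        // On a 1-2 file project a single edit is already over 30%.
        return false;
    }
    (changed_files as f64) / (total_files as f64) > FULL_REINDEX_RATIO
}
```

Without the guard, editing one file in a two-file project gives a 50% changeset and would always take the full-reindex path.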
- Fix root CLAUDE.md: correct crate count (11→12), add codegraph-lsp, fix MCP tools table (all 8 verified working), document two-stage extraction pipeline, add enrichment_deps to schema, update plan refs
- Create per-crate CLAUDE.md for 8 crates: extraction, core, sync, db, vectors, lsp, graph, mcp, each with public API, gotchas, patterns
- Create docs/CLAUDE.md: standardized formats for plans, issues, gaps
- Update docs/issues.md: remove 3 resolved P0s (code_snippet, edges, codegraph_node), keep valid P1s (signature, docstring)
- Update docs/gaps.md: remove resolved code display gap
- Standardize plan doc headers to Date/Status/Crates/Depends-on format
- Archive completed plans to docs/plans/completed/ (sync redesign, LINKED_REPOS superseded by CROSS_LANGUAGE_LINKING)
- Add plan cleanup and per-crate doc maintenance rules
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The entire src/ directory and package-lock.json are remnants of the original TypeScript implementation, now fully replaced by the Rust workspace in crates/. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add build_instructions() to McpServer, included in initialize response to guide AI agents on tool workflow, node ID usage, and prerequisites
- Rewrite all 8 tool descriptions with actionable guidance: when to use each tool, what inputs they expect, and relationship to other tools
- Promote codegraph_context as the primary entry point in docs
- Update root CLAUDE.md and per-crate CLAUDE.md to match
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
... architecture Expand from AppSync-specific to general GraphQL linking. Add security model (path-based trust, operation allowlists), detail all affected crates including graph/context/sync/vectors, and flesh out extraction patterns for GraphQL schema, TypeScript resolvers, and Dart operations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Complete rewrite of CodeGraph from TypeScript to Rust, implementing a local-first code intelligence system with:
Key Features
Crate Structure
- `codegraph-types` - Shared types (Node, Edge, NodeKind, EdgeKind)
- `codegraph-db` - SQLite with FTS5, schema, prepared statements
- `codegraph-extraction` - Tree-sitter AST parsing
- `codegraph-resolution` - Reference resolution, framework patterns
- `codegraph-graph` - BFS/DFS traversal, circular deps, dead code
- `codegraph-vectors` - ONNX embeddings with CoreML
- `codegraph-context` - Context building for AI
- `codegraph-sync` - Incremental updates, git hooks
- `codegraph-mcp` - MCP server (7 tools)
- `codegraph-core` - Orchestration layer
- `codegraph-cli` - CLI entry point
MCP Tools Status
- `codegraph_search`
- `codegraph_context`
- `codegraph_file_nodes`
- `codegraph_status`
- `codegraph_node`
- `codegraph_callers`
- `codegraph_callees`
- `codegraph_impact`
Known Issues
See `docs/issues.md` for tracked bugs:
- `contains` edges created (no calls/imports/extends)
- `code_snippet` field not populated
- `signature` and `docstring` fields not populated
Documentation
- `CLAUDE.md` - Updated with verified tool status and gotchas
- `docs/issues.md` - Bug tracking
- `docs/gaps.md` - Feature gaps vs competitors
Test plan
- `cargo build` succeeds
- `cargo test` passes
🤖 Generated with Claude Code