h-codex

A semantic code search tool for intelligent, cross-repo context retrieval.

✨ Features

  • AST-Based Chunking: Intelligent code parsing using Abstract Syntax Trees for optimal chunk boundaries
  • Embedding & Semantic Search: Using OpenAI's text-embedding-3-small model (support for voyage-code-3 planned)
  • Vector Database: PostgreSQL with pgvector extension for efficient similarity search
  • Multi-Language Support: TypeScript and JavaScript, extensible to other languages
  • Multi-Project Support: Index and search multiple projects
  • MCP Integration: Seamlessly connects with AI coding assistants through Model Context Protocol
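
To make the semantic search feature concrete, here is a minimal in-memory sketch of ranking chunks by cosine similarity to a query embedding. It is an illustration only: in h-codex the similarity search runs inside PostgreSQL via pgvector, and the `IndexedChunk` shape and the `cosineSimilarity` and `searchChunks` names are assumptions made for this example, not the project's API.

```typescript
// Hypothetical shape of an indexed chunk; not h-codex's actual schema.
interface IndexedChunk {
  filePath: string;
  content: string;
  embedding: number[];
}

interface SearchHit {
  chunk: IndexedChunk;
  similarity: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // lengths are equal, so the fallback is never used
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep hits at or above the threshold, best first, capped at `limit`.
function searchChunks(
  query: number[],
  chunks: IndexedChunk[],
  threshold = 0.5,
  limit = 10
): SearchHit[] {
  return chunks
    .map((chunk) => ({ chunk, similarity: cosineSimilarity(query, chunk.embedding) }))
    .filter((hit) => hit.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```

The default `threshold` and `limit` values mirror the documented defaults of SIMILARITY_THRESHOLD and SEARCH_RESULTS_LIMIT.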

🚀 Demo

[Image: demo]

💻 Getting Started

h-codex can be integrated with AI assistants through the Model Context Protocol.

Example with Claude Desktop

Edit your Claude Desktop configuration file (claude_desktop_config.json):

{
  "mcpServers": {
    "h-codex": {
      "command": "npx",
      "args": ["@hpbyte/h-codex-mcp"],
      "env": {
        "LLM_API_KEY": "your_llm_api_key_here",
        "LLM_BASE_URL": "your_llm_base_url_here (default is openai baseurl: https://api.openai.com/v1)",
        "DB_CONNECTION_STRING": "postgresql://postgres:password@localhost:5432/h-codex"
      }
    }
  }
}
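
If you prefer to generate this entry rather than edit it by hand, the following sketch builds the same structure in TypeScript and serializes it. The `McpServerEntry` type and the `hCodexServerEntry` and `renderMcpConfig` helpers are hypothetical names introduced for illustration; only the command, package name, and environment variable names come from the example above.

```typescript
// Hypothetical type describing one entry under "mcpServers".
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds the h-codex entry. LLM_BASE_URL must be a plain URL; it falls back
// to the documented OpenAI default when no value is given.
function hCodexServerEntry(
  apiKey: string,
  dbConnectionString: string,
  baseUrl: string = "https://api.openai.com/v1"
): McpServerEntry {
  return {
    command: "npx",
    args: ["@hpbyte/h-codex-mcp"],
    env: {
      LLM_API_KEY: apiKey,
      LLM_BASE_URL: baseUrl,
      DB_CONNECTION_STRING: dbConnectionString,
    },
  };
}

// Serializes the entry into the JSON layout shown above.
function renderMcpConfig(entry: McpServerEntry): string {
  return JSON.stringify({ mcpServers: { "h-codex": entry } }, null, 2);
}
```

Calling `renderMcpConfig(hCodexServerEntry("your_llm_api_key_here", "postgresql://postgres:password@localhost:5432/h-codex"))` yields JSON that can be pasted into the settings file.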

🛠️ Development

Prerequisites

  • Node.js (v18+)
  • pnpm - Package manager
  • Docker - For running PostgreSQL with pgvector
  • OpenAI API key for embeddings

Getting Started

  1. Clone the repository

    git clone https://github.com/hpbyte/h-codex.git
    cd h-codex

  2. Set up environment variables

    cp packages/core/.env.example packages/core/.env

    Edit the .env file with your OpenAI API key and other configuration options.

  3. Install dependencies

    pnpm install

  4. Start PostgreSQL database

    cd dev && docker compose up -d

  5. Set up the database

    pnpm run db:migrate

  6. Start development server

    pnpm dev

🔧 Configuration Options

| Environment Variable | Description | Default |
| --- | --- | --- |
| LLM_API_KEY | LLM API key for embeddings | Required |
| LLM_BASE_URL | Base URL of the LLM API used for embeddings | https://api.openai.com/v1 |
| EMBEDDING_MODEL | OpenAI model for embeddings | text-embedding-3-small |
| CHUNK_SIZE | Maximum chunk size in characters | 1000 |
| SEARCH_RESULTS_LIMIT | Maximum number of search results returned | 10 |
| SIMILARITY_THRESHOLD | Minimum similarity for a result to be returned | 0.5 |
| DB_CONNECTION_STRING | PostgreSQL connection string | postgresql://postgres:password@localhost:5432/h-codex |
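
As a sketch of how these options and defaults might be resolved, the function below reads them from a plain key/value map and applies the documented defaults. The `HCodexConfig` type and the `loadConfig` function are assumptions made for illustration and are not taken from the h-codex source; only the variable names and default values come from the table.

```typescript
// Hypothetical typed view of the documented options.
interface HCodexConfig {
  llmApiKey: string;
  llmBaseUrl: string;
  embeddingModel: string;
  chunkSize: number;
  searchResultsLimit: number;
  similarityThreshold: number;
  dbConnectionString: string;
}

type Env = Record<string, string | undefined>;

// Returns the fallback when the variable is unset or blank.
function stringOr(raw: string | undefined, fallback: string): string {
  return raw === undefined || raw.trim() === "" ? fallback : raw;
}

// Parses a numeric variable, rejecting values that are not finite numbers.
function numberOr(name: string, raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!isFinite(value)) {
    throw new Error(`${name} must be a number, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): HCodexConfig {
  const llmApiKey = env["LLM_API_KEY"];
  if (llmApiKey === undefined || llmApiKey.trim() === "") {
    throw new Error("LLM_API_KEY is required");
  }
  return {
    llmApiKey,
    llmBaseUrl: stringOr(env["LLM_BASE_URL"], "https://api.openai.com/v1"),
    embeddingModel: stringOr(env["EMBEDDING_MODEL"], "text-embedding-3-small"),
    chunkSize: numberOr("CHUNK_SIZE", env["CHUNK_SIZE"], 1000),
    searchResultsLimit: numberOr("SEARCH_RESULTS_LIMIT", env["SEARCH_RESULTS_LIMIT"], 10),
    similarityThreshold: numberOr("SIMILARITY_THRESHOLD", env["SIMILARITY_THRESHOLD"], 0.5),
    dbConnectionString: stringOr(
      env["DB_CONNECTION_STRING"],
      "postgresql://postgres:password@localhost:5432/h-codex"
    ),
  };
}
```

In a Node.js process the map would typically be `process.env`; taking it as a parameter keeps the function easy to test.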

🏗️ Architecture

graph TD
  subgraph "Core Package"
    subgraph "Ingestion Pipeline"
      Explorer["Explorer<br/>(file discovery)"]
      Chunker["Chunker<br/>(AST parsing & chunking)"]
      Embedder["Embedder<br/>(semantic embeddings)"]
      Indexer["Indexer<br/>(orchestration)"]
      Explorer --> Chunker
      Chunker --> Embedder
      Embedder --> Indexer
    end
    subgraph "Storage Layer"
      Repository["Repository"]
    end
    Indexer --> Repository
    Repository --> Database[(PostgreSQL Vector Database)]
  end
  subgraph "MCP Package"
    MCPServer["MCP Server"]
    CodeIndexTool["Code Index Tool"]
    CodeSearchTool["Code Search Tool"]
    MCPServer --> CodeIndexTool
    MCPServer --> CodeSearchTool
  end
  CodeIndexTool --> Indexer
  CodeSearchTool --> Repository
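
The ingestion flow in the diagram can be sketched roughly as follows. Everything here is a stand-in chosen for illustration: a blank-line splitter replaces the AST-based chunker, a toy function replaces the call to the embeddings API (which would be asynchronous in practice), and an in-memory array replaces PostgreSQL. None of the names are taken from the h-codex source.

```typescript
interface SourceFile { path: string; content: string; }
interface Chunk { filePath: string; content: string; }
interface EmbeddedChunk extends Chunk { embedding: number[]; }

// Chunker stand-in: splits on blank lines, then greedily packs the pieces
// into chunks of at most maxChars characters. A single piece longer than
// maxChars is kept whole rather than cut mid-statement.
function chunkFile(file: SourceFile, maxChars: number): Chunk[] {
  const units = file.content.split(/\n\s*\n/).filter((u) => u.trim() !== "");
  const chunks: Chunk[] = [];
  let current = "";
  for (const unit of units) {
    const candidate = current === "" ? unit : current + "\n\n" + unit;
    if (candidate.length > maxChars && current !== "") {
      chunks.push({ filePath: file.path, content: current });
      current = unit;
    } else {
      current = candidate;
    }
  }
  if (current !== "") chunks.push({ filePath: file.path, content: current });
  return chunks;
}

// Embedder stand-in: any function from text to a vector.
type Embedder = (text: string) => number[];

// Repository stand-in: keeps embedded chunks in memory.
class InMemoryRepository {
  private rows: EmbeddedChunk[] = [];
  save(chunks: EmbeddedChunk[]): void {
    this.rows = this.rows.concat(chunks);
  }
  count(): number {
    return this.rows.length;
  }
}

// Indexer stand-in: chunk each file, embed each chunk, store the result.
// Returns the number of chunks stored.
function indexFiles(
  files: SourceFile[],
  embed: Embedder,
  repo: InMemoryRepository,
  maxChars = 1000
): number {
  let total = 0;
  for (const file of files) {
    const embedded = chunkFile(file, maxChars).map((c) => ({
      filePath: c.filePath,
      content: c.content,
      embedding: embed(c.content),
    }));
    repo.save(embedded);
    total += embedded.length;
  }
  return total;
}
```

Passing the embedder and repository as arguments follows the diagram's separation of concerns: the indexer only orchestrates, so each stage can be swapped without touching the others.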

🗺️ Roadmap

  • Support for additional embedding providers (Voyage AI)
  • Enhanced language support with more tree-sitter parsers

📄 License

This project is licensed under the MIT License.
