A multi-model AI orchestration MCP server for automated code review and LLM-powered analysis. Multi-MCP integrates with Claude Code CLI to orchestrate multiple AI models (OpenAI GPT, Anthropic Claude, Google Gemini) for code quality checks, security analysis (OWASP Top 10), and multi-agent consensus. Built on the Model Context Protocol (MCP), this tool enables Python developers and DevOps teams to automate code reviews with AI-powered insights directly in their development workflow.
- Code Review - Systematic workflow with OWASP Top 10 security checks and performance analysis
- Chat - Interactive development assistance with repository context awareness
- Compare - Parallel multi-model analysis for architectural decisions
- Debate - Multi-agent consensus workflow (independent answers + critique)
- Multi-Model Support - OpenAI GPT, Anthropic Claude, Google Gemini, and OpenRouter
- CLI & API Models - Mix CLI-based (Gemini CLI, Codex CLI) and API models
- Model Aliases - Use short names like `mini`, `sonnet`, `gemini`
- Threading - Maintain context across multi-step reviews
Multi-MCP acts as an MCP server that Claude Code connects to, providing AI-powered code analysis tools:
- Install the MCP server and configure your AI model API keys
- Integrate with Claude Code CLI automatically via `make install`
- Invoke tools using natural language (e.g., "multi codereview this file")
- Get Results from multiple AI models orchestrated in parallel
Fast Multi-Model Analysis:
- Parallel Execution - 3 models in ~10s (vs ~30s sequential)
- Async Architecture - Non-blocking Python asyncio
- Conversation Threading - Maintains context across multi-step reviews
- Low Latency - Response time = slowest model, not sum of all models
Prerequisites:
- Python 3.11+
- API key for at least one provider (OpenAI, Anthropic, Google, or OpenRouter)
```bash
# Clone and install
git clone https://github.com/religa/multi_mcp.git
cd multi_mcp

# Execute ./scripts/install.sh
make install

# The installer will:
# 1. Install dependencies (uv sync)
# 2. Generate your .env file
# 3. Automatically add to Claude Code config (requires jq)
# 4. Test the installation
```
If you prefer not to run make install:
```bash
# Install dependencies
uv sync

# Copy and configure .env
cp .env.example .env
# Edit .env with your API keys
```
Add to Claude Code (~/.claude.json), replacing /path/to/multi_mcp with your actual clone path:
```json
{
  "mcpServers": {
    "multi": {
      "type": "stdio",
      "command": "/path/to/multi_mcp/.venv/bin/python",
      "args": ["-m", "multi_mcp.server"]
    }
  }
}
```

Multi-MCP loads settings from .env files in this order (highest priority first):
- Environment variables (already set in shell)
- Project `.env` (current directory or project root)
- User `.env` (`~/.multi_mcp/.env`) - fallback for pip installs
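For reference, the precedence above can be pictured in a few lines of Python. This is only a sketch of the described behavior; `load_settings_env` and the use of python-dotenv are assumptions, not the project's actual loader:

```python
# Illustrative sketch of the precedence described above -- not the project's actual loader.
from pathlib import Path

from dotenv import load_dotenv  # pip install python-dotenv


def load_settings_env(project_root: Path) -> None:
    # 1. Shell environment variables always win: override=False never
    #    replaces a variable that is already set.
    # 2. Project .env is loaded next, so it beats the user-level fallback.
    load_dotenv(project_root / ".env", override=False)
    # 3. User-level ~/.multi_mcp/.env fills in anything still missing
    #    (useful for pip installs without a project checkout).
    load_dotenv(Path.home() / ".multi_mcp" / ".env", override=False)
```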
Edit .env with your API keys:
```bash
# API Keys (configure at least one)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
OPENROUTER_API_KEY=sk-or-...

# Azure OpenAI (optional)
AZURE_API_KEY=...
AZURE_API_BASE=https://your-resource.openai.azure.com/

# AWS Bedrock (optional)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION_NAME=us-east-1

# Model Configuration
DEFAULT_MODEL=gpt-5-mini
DEFAULT_MODEL_LIST=gpt-5-mini,gemini-3-flash
```
Models are defined in YAML configuration files (user config wins):
- Package defaults: `multi_mcp/config/config.yaml` (bundled with package)
- User overrides: `~/.multi_mcp/config.yaml` (optional, takes precedence)
To add your own models, create ~/.multi_mcp/config.yaml (see config.yaml and config.override.example.yaml for examples):
version: "1.0" models: # Add a new API model my-custom-gpt: litellm_model: openai/gpt-4o aliases: - custom notes: "My custom GPT-4o configuration" # Add a custom CLI model my-local-llm: provider: cli cli_command: ollama cli_args: - "run" - "llama3.2" cli_parser: text aliases: - local notes: "Local LLaMA via Ollama" # Override an existing model's settings gpt-5-mini: constraints: temperature: 0.5 # Override default temperature
Merge behavior:
- New models are added alongside package defaults
- Existing models are merged (your settings override package defaults)
- Aliases can be "stolen" from package models by your custom models
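Roughly, the merge behaves like the sketch below. This is illustrative only; `merge_model_configs` is a made-up helper, not the package's real implementation:

```python
# Illustrative sketch of the merge rules above -- not the package's actual code.
from copy import deepcopy


def merge_model_configs(package: dict, user: dict) -> dict:
    """Layer user model definitions over the package defaults."""
    merged = deepcopy(package)
    user_models = user.get("models", {})
    for name, user_cfg in user_models.items():
        base = merged.setdefault("models", {}).setdefault(name, {})
        # New models are added; for existing models, user keys override package keys.
        for key, value in user_cfg.items():
            base[key] = value
    # Aliases stay unique: if a user model claims an alias, it is removed
    # from any package model that previously owned it ("stealing" the alias).
    claimed = {a for cfg in user_models.values() for a in cfg.get("aliases", [])}
    for name, cfg in merged.get("models", {}).items():
        if name not in user_models:
            cfg["aliases"] = [a for a in cfg.get("aliases", []) if a not in claimed]
    return merged
```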
Once installed in Claude Code, you can use these commands:
Chat - Interactive development assistance:
Can you ask Multi chat what's the answer to life, universe and everything?
Code Review - Analyze code with specific models:
Can you multi codereview this module for code quality and maintainability using gemini-3 and codex?
Compare - Get multiple perspectives (uses default models):
Can you multi compare the best state management approach for this React app?
Debate - Deep analysis with critique:
Can you multi debate the best project code name for this project?
Edit `~/.claude/settings.json` and add the following entries to `permissions.allow` so Claude Code can use Multi-MCP tools without prompting for permission:
```json
{
  "permissions": {
    "allow": [
      ...
      "mcp__multi__chat",
      "mcp__multi__codereview",
      "mcp__multi__compare",
      "mcp__multi__debate",
      "mcp__multi__models"
    ]
  },
  "env": {
    "MCP_TIMEOUT": "300000",
    "MCP_TOOL_TIMEOUT": "300000"
  }
}
```

Use short aliases instead of full model names:
| Alias | Model | Provider |
|---|---|---|
| `mini` | `gpt-5-mini` | OpenAI |
| `nano` | `gpt-5-nano` | OpenAI |
| `gpt` | `gpt-5.2` | OpenAI |
| `codex` | `gpt-5.1-codex` | OpenAI |
| `sonnet` | `claude-sonnet-4.5` | Anthropic |
| `haiku` | `claude-haiku-4.5` | Anthropic |
| `opus` | `claude-opus-4.5` | Anthropic |
| `gemini` | `gemini-3-pro-preview` | Google |
| `flash` | `gemini-3-flash` | Google |
| `azure-mini` | `azure-gpt-5-mini` | Azure |
| `bedrock-sonnet` | `bedrock-claude-4-5-sonnet` | AWS |
Run multi:models to see all available models and aliases.
Multi-MCP can execute CLI-based AI models (like Gemini CLI, Codex CLI, or Claude CLI) alongside API models. CLI models run as subprocesses and work seamlessly with all existing tools.
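Conceptually, a CLI model is just a subprocess whose stdout becomes the model's answer, parsed according to its `cli_parser` setting. The sketch below illustrates the idea with asyncio; names like `run_cli_model` are illustrative, and how each CLI actually receives the prompt varies:

```python
# Illustrative sketch: running a CLI-based model as a subprocess (not the project's actual code).
import asyncio


async def run_cli_model(command: str, args: list[str], prompt: str) -> str:
    """Invoke a CLI tool, feed the prompt on stdin, and return its text output."""
    proc = await asyncio.create_subprocess_exec(
        command, *args,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate(prompt.encode())
    if proc.returncode != 0:
        raise RuntimeError(f"{command} failed: {stderr.decode().strip()}")
    # With cli_parser: text, raw stdout is treated as the answer;
    # a json/jsonl parser would decode structured output here instead.
    return stdout.decode().strip()


# Example: asyncio.run(run_cli_model("ollama", ["run", "llama3.2"], "Review this diff ..."))
```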
Benefits:
- Use models with full tool access (file operations, shell commands)
- Mix API and CLI models in `compare` and `debate` workflows
- Leverage local CLIs without API overhead
Built-in CLI Models:
- `gemini-cli` (alias: `gem-cli`) - Gemini CLI with auto-edit mode
- `codex-cli` (alias: `cx-cli`) - Codex CLI with full-auto mode
- `claude-cli` (alias: `cl-cli`) - Claude CLI with acceptEdits mode
Adding Custom CLI Models:
Add to ~/.multi_mcp/config.yaml (see Model Configuration):
version: "1.0" models: my-ollama: provider: cli cli_command: ollama cli_args: - "run" - "codellama" cli_parser: text # "json", "jsonl", or "text" aliases: - ollama notes: "Local CodeLlama via Ollama"
Prerequisites:
CLI models require the respective CLI tools to be installed:
```bash
# Gemini CLI
npm install -g @google/gemini-cli

# Codex CLI
npm install -g @openai/codex

# Claude CLI
npm install -g @anthropic-ai/claude-code
```
Multi-MCP includes a standalone CLI for code review without needing an MCP client.
```bash
# Review a directory
multi src/

# Review specific files
multi src/server.py src/config.py

# Use a different model
multi --model mini src/

# JSON output for CI/pipelines
multi --json src/ > results.json

# Verbose logging
multi -v src/

# Specify project root (for CLAUDE.md loading)
multi --base-path /path/to/project src/
```
| Feature | Multi-MCP | Single-Model Tools |
|---|---|---|
| Parallel model execution | Yes | No |
| Multi-model consensus | Yes | Varies |
| Model debates | Yes | No |
| CLI + API model support | Yes | No |
| OWASP security analysis | Yes | Varies |
"No API key found"
- Add at least one API key to your `.env` file
- Verify it's loaded: `uv run python -c "from multi_mcp.settings import settings; print(settings.openai_api_key)"`
Integration tests fail
- Set the `RUN_E2E=1` environment variable
- Verify API keys are valid and have sufficient credits
Debug mode:
```bash
export LOG_LEVEL=DEBUG  # INFO is default
uv run python -m multi_mcp.server
```
Check logs in logs/server.log for detailed information.
Q: Do I need all three AI providers?
A: No, just one API key (OpenAI, Anthropic, or Google) is enough to get started.
Q: Does it truly run in parallel?
A: Yes! When you use the `codereview`, `compare`, or `debate` tools, all models are executed concurrently using Python's `asyncio.gather()`. This means you get responses from multiple models in the time it takes for the slowest model to respond, not the sum of all response times.
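Conceptually, the fan-out looks like the sketch below (simplified; `query_model` is a stand-in for the per-model provider call, not the server's real API):

```python
# Simplified sketch of the parallel fan-out described above.
import asyncio


async def query_model(model: str, prompt: str) -> str:
    ...  # call the provider for one model and return its answer


async def fan_out(models: list[str], prompt: str) -> dict[str, str]:
    # All requests start immediately; total latency tracks the slowest model,
    # not the sum of all response times.
    answers = await asyncio.gather(*(query_model(m, prompt) for m in models))
    return dict(zip(models, answers))
```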
Q: How many models can I run at the same time?
A: There's no hard limit! You can run as many models as you want in parallel. In practice, 2-5 models work well for most use cases. All tools use your configured default models (typically 2-3), but you can specify any number of models you want.
We welcome contributions! See CONTRIBUTING.md for:
- Development setup
- Code standards
- Testing guidelines
- Pull request process
Quick start:
```bash
git clone https://github.com/YOUR_USERNAME/multi_mcp.git
cd multi_mcp
uv sync --extra dev
make check && make test
```
MIT License - see LICENSE file for details