A PR Review Agent practice project built after studying learn-claude-code: it implements Debate Council multi-agent review, Tool Calling, structured reports, and AI Judge evaluation.
Updated May 24, 2026 · Python
Real-time, AI-judged competitive debate platform. Engage in structured 1v1 intellectual battles scored by Gemini AI on Logic, Facts, and Relevance.
An AI system that evaluates hackathon projects like a real judge.
Evaluation pipeline for the "Ideas Recall" Telegram bot
PromptForge — privacy-first prompt engineering studio. Run one prompt across many OpenAI-compatible models in parallel (Xiaomi MiMo, OpenAI, DeepSeek, Groq, Ollama), compare side-by-side, and use one model as a judge. Next.js 16 + React 19 + TypeScript + Tailwind CSS 4.
Evaluate German translation answers on Sharplingo with AI, using your own model, API, and prompt template.
AI-powered Prompt Engineering Lab with Meta-Prompting and Auto-Evaluation (GPT-5.2 Judge)
🥊 Decide between 2-4 ideas with a structured tournament of independent fresh-context AI judges. Auditable, evidence-based verdicts. Claude Code skill.
Real-time multiplayer Mao card game with LLM-powered judge and dynamic rules engine
Thor — autonomous LLM evaluation system (LLMOps). Compares prompts/models with reproducible methodology, custom code, no external frameworks. Built for Sistema Liz municipal chatbot.