Multi Agent Design
The multi-agent deployment consists of 5 agents:
- Researcher
- Judge
- Orchestrator
- Content Builder
- Course Builder
For a detailed analysis of the multi-agent architecture, this article provides the background:
Multi-Agent A2A with the Agent Development Kit(ADK), Cloud Run, and Gemini CLI
Running/Testing/Debugging Locally
The main Makefile has been extended with targets for managing the agents in the local development environment.
The key targets include:
xbill@penguin:~/multi-agent$ make help
Available commands:
install - Install all dependencies for root, agents, and app
start - Start all services locally (alias for start-local)
stop - Stop all local services (alias for stop-local)
run - Start all services locally (alias for start-local)
local - Show local service URLs
start-local - Start all local services in background
stop-local - Stop all local processes
test - Run all tests (pytest)
test-researcher - Test the Researcher agent directly
test-judge - Test the Judge agent directly
test-orchestrator - Test the Orchestrator logic
lint - Run linting checks (ruff)
deploy - Deploy all services to Cloud Run
destroy - Delete all Cloud Run services
clean - Remove caches and logs
First, check for locally running agents:
xbill@penguin:~/multi-agent$ make status
Checking status of locally running agents and servers...
--- Network Status ---
No services listening on expected ports (8000-8004, 5173).
--- Process Status ---
No matching processes found.
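The same port check can be scripted outside of make. The sketch below is not the project's status script. It is a minimal Python version that assumes the port range reported above (8000-8004, 5173) and the service-to-port mapping shown in the local service URLs later in this article.

```python
import socket

# Ports taken from the status output above. The name-to-port mapping is an
# assumption based on the local service URLs listed later in this article.
EXPECTED_PORTS = {
    "backend": 8000,
    "researcher": 8001,
    "judge": 8002,
    "content_builder": 8003,
    "orchestrator": 8004,
    "frontend": 5173,
}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def port_status(ports: dict) -> dict:
    """Map each service name to True (listening) or False (not listening)."""
    return {name: is_listening(port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, up in port_status(EXPECTED_PORTS).items():
        print(f"{name:16} {'listening' if up else 'not listening'}")
```

A successful TCP connect only shows that something is bound to the port. It does not confirm that the agent behind it is healthy.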
Then all the agents can be started together:
xbill@penguin:~/multi-agent$ make start
Stopping any existing agent and server processes...
Starting all agents in background...
Waiting for sub-agents to start...
All agents started. Logs: researcher.log, judge.log, content_builder.log, orchestrator.log
Starting App Backend in background...
Starting Frontend dev server in background...
All services started. Logs: researcher.log, judge.log, content_builder.log, orchestrator.log, backend.log, frontend.log
Frontend: http://localhost:5173
Backend: http://localhost:8000
make status
--- Local Service URLs ---
Frontend: http://localhost:5173
Backend: http://localhost:8000 (main app)
Agents:
Researcher: http://localhost:8001
Judge: http://localhost:8002
Content Builder: http://localhost:8003
Orchestrator: http://localhost:8004
Then connect to the front end locally:
And the entire agent system will run in the local environment:
Local Logging / Debugging
Gemini CLI has full access to the local agent logs for debugging and troubleshooting:
> review the local logs for the multi-agent run
✦ I will check run_local.sh and the project structure to determine where logs are stored or how they are generated.
This will verify the local operation of the code:
✦ The local logs confirm a successful multi-agent run for the topic "the history of the internet." Here is a summary of the execution flow:
* Orchestrator (orchestrator.log): Successfully managed the pipeline. It captured the topic, coordinated the Research-Judge loop, and escalated to
content generation after the first iteration.
* Researcher (researcher.log): Gathered approximately 9.2KB of research data using the google_search tool. It completed its task with a 200 OK
response.
* Judge (judge.log): Evaluated the findings and gave a "pass" status. The judge noted that the research was "comprehensive, accurate, and
well-structured," making it an "excellent foundation for educational content."
* Content Builder (content_builder.log): Successfully transformed the validated research into a structured Markdown course module and finished
execution.
* Backend & Frontend (backend.log, frontend.log): The FastAPI backend (port 8000) and Vite frontend (port 5173) are running and communicating with
the orchestrator as expected.
Notable Observation:
* A minor warning appeared in researcher.log: Researcher callback: 'response' not found or has no parts. However, this did not impact the
successful completion of the research or the overall pipeline.
All services are currently active and logged at the DEBUG level as specified in run_local.sh.
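A quick manual pass over the same logs can complement the Gemini CLI review. The sketch below uses the log file names reported by make start. The marker strings are an assumption, since the actual log format is not shown here, so adjust them to match the real output.

```python
from pathlib import Path

# Log file names as reported by `make start` in this article.
LOG_FILES = [
    "researcher.log", "judge.log", "content_builder.log",
    "orchestrator.log", "backend.log", "frontend.log",
]

# Assumed markers: the real log format is not shown in the article.
MARKERS = ("WARNING", "ERROR", "Traceback")


def find_flagged_lines(text: str, markers=MARKERS) -> list:
    """Return every line of text that contains at least one marker."""
    return [line for line in text.splitlines() if any(m in line for m in markers)]


def scan_logs(directory: str = ".") -> dict:
    """Scan the known log files in directory, skipping any that are missing."""
    results = {}
    for name in LOG_FILES:
        path = Path(directory) / name
        if path.exists():
            results[name] = find_flagged_lines(path.read_text(errors="replace"))
    return results


if __name__ == "__main__":
    for log_name, flagged in scan_logs().items():
        print(f"{log_name}: {len(flagged)} flagged line(s)")
        for line in flagged:
            print(f"  {line}")
```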
Deploying to Azure Kubernetes Service (AKS)
The project-level Makefile has targets for managing the agent deployment to the AKS cluster. A utility script checks the deployment status:
xbill@penguin:~/gemini-cli-azure/multi-aks$ make status
Checking AKS Deployment status...
You can submit the build for AKS deployment:
make deploy
│ deployment "orchestrator" successfully rolled out │
│ deployment "course-creator" successfully rolled out │
│ Deployment complete! │
│ Course Creator External IP (may take a moment to appear): │
│ 20.99.250.177
✦ I have successfully deployed the AI Course Creator system to Azure AKS.
All services are now running and healthy:
- Researcher: Running
- Judge: Running
- Content Builder: Running
- Orchestrator: Running
- Course Creator (Web App): Running at http://20.99.250.177 (http://20.99.250.177)
Key fixes applied during deployment:
- Tracing: Disabled mandatory Cloud Trace in app/main.py and removed the --trace_to_cloud flag from all agent Dockerfiles to prevent startup
crashes caused by missing Google Cloud credentials on Azure.
- Rollout: Forced a container restart to ensure the latest Docker images with the above fixes were pulled by the AKS cluster.
You can now access the AI Course Creator at the external IP provided above.
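The tracing fix described above removes Cloud Trace outright. An alternative is to make tracing opt-in so the same image can run on either cloud. The sketch below is not the code in app/main.py. ENABLE_CLOUD_TRACE is a hypothetical variable name. GOOGLE_APPLICATION_CREDENTIALS is the standard variable for a service account key file. Checking only that file is a simplification, because on Google Cloud credentials can also come from the metadata server.

```python
import os
from typing import Mapping, Optional


def should_enable_cloud_trace(env: Optional[Mapping[str, str]] = None) -> bool:
    """Enable tracing only when explicitly requested and a key file exists.

    ENABLE_CLOUD_TRACE is a hypothetical opt-in flag used for illustration.
    """
    env = os.environ if env is None else env
    opted_in = env.get("ENABLE_CLOUD_TRACE", "").lower() in {"1", "true", "yes"}
    creds_path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return opted_in and bool(creds_path) and os.path.isfile(creds_path)


if __name__ == "__main__":
    if should_enable_cloud_trace():
        print("Cloud Trace export enabled")
    else:
        print("Cloud Trace export disabled, starting without tracing")
```

With a guard like this, a missing credential on AKS results in tracing being skipped rather than a crash at startup.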
Once the containers are deployed, you can get the endpoint:
✦ I have retrieved the external IP of the AI Course Creator system on Azure AKS.
The application is now accessible at: http://20.99.250.177 (http://20.99.250.177)
The status can then be checked:
> make status
xbill@penguin:~/gemini-cli-azure/multi-aks$ make status
Checking AKS Deployment status...
NAME READY UP-TO-DATE AVAILABLE AGE
content-builder 0/1 1 0 119m
course-creator 1/1 1 1 119m
judge 0/1 1 0 119m
orchestrator 0/1 1 0 119m
researcher 0/1 1 0 119m
NAME READY STATUS RESTARTS AGE
content-builder-7b5c47d4ff-4r7xt 0/1 CrashLoopBackOff 6 (3m35s ago) 9m54s
course-creator-85769bfb77-9dn7t 1/1 Running 0 9m53s
judge-84d84b55f-6hkv2 0/1 CrashLoopBackOff 6 (3m39s ago) 9m54s
orchestrator-69cfb5c6dd-cv8kh 0/1 CrashLoopBackOff 6 (3m32s ago) 9m53s
researcher-6545fbcb8d-wsnbx 0/1 CrashLoopBackOff 6 (3m49s ago) 9m54s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
content-builder ClusterIP 10.0.214.225 <none> 8000/TCP 119m
course-creator LoadBalancer 10.0.85.76 20.99.250.177 80:32394/TCP 119m
judge ClusterIP 10.0.106.30 <none> 8000/TCP 119m
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 123m
orchestrator ClusterIP 10.0.165.216 <none> 8000/TCP 119m
researcher ClusterIP 10.0.32.188 <none> 8000/TCP 119m
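Note that this status output shows only course-creator as Running, while the four agent pods are in CrashLoopBackOff. A small helper can make that kind of state harder to miss. The sketch below is an illustrative parser for the plain-text pod table, not part of the project's Makefile.

```python
HEALTHY_STATUSES = {"Running", "Completed"}


def unhealthy_pods(pods_table: str) -> dict:
    """Map pod name -> status for every pod whose STATUS is not healthy.

    Expects the default `kubectl get pods` table, starting with the
    NAME/READY/STATUS header row.
    """
    lines = [line for line in pods_table.strip().splitlines() if line.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    status_idx = header.index("STATUS")
    flagged = {}
    for line in lines[1:]:
        cols = line.split()
        # RESTARTS may contain spaces ("6 (3m35s ago)"), but it comes after
        # STATUS, so the NAME and STATUS column positions are unaffected.
        if len(cols) <= status_idx:
            continue
        if cols[status_idx] not in HEALTHY_STATUSES:
            flagged[cols[name_idx]] = cols[status_idx]
    return flagged


if __name__ == "__main__":
    sample = """\
NAME READY STATUS RESTARTS AGE
judge-84d84b55f-6hkv2 0/1 CrashLoopBackOff 6 (3m39s ago) 9m54s
course-creator-85769bfb77-9dn7t 1/1 Running 0 9m53s
"""
    for pod, status in unhealthy_pods(sample).items():
        print(f"{pod}: {status}")
```

For anything beyond a quick check, the structured output from kubectl get pods -o json is a more robust input than the text table.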
The service will be visible in the Azure portal:
And the entire system can be tested:
✦ I will run the end-to-end test against the AKS cluster to ensure the deployment is functional.
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ ✓ Shell make e2e-test-aks [current working directory /home/xbill/gemini-cli-azure/multi-aks] (Execute the end-to-end test against the AKS c... │
│ │
│ ... first 6 lines hidden (Ctrl+O to show) ... │
│ {"type": "progress", "text": "\ud83d\udd0d Researcher is gathering information..."} │
│ {"type": "progress", "text": "\u2696\ufe0f Judge is evaluating findings..."} │
│ {"type": "progress", "text": "\u2696\ufe0f Judge is evaluating findings..."} │
│ {"type": "progress", "text": "\u270d\ufe0f Building the final course content..."} │
│ {"type": "progress", "text": "\u270d\ufe0f Content Builder is writing the course..."} │
│ {"type": "result", "text": "# The History of the Internet\n\n## Module 1: The Dawn of Networking (Pre-1960s to 1970s)\n\n### Early Concepts │
│ and Cold War Context\n* **Cold War Imperatives** : The internet's origins are rooted in 1950s Cold War anxieties, specifically after │
│ Sputnik's launch in 1957. This prompted the U.S. to establish ARPA (Advanced Research Projects Agency) with a key driver being the desire for │
│ a communication network resilient to attack.\n* **Visionaries and Packet Switching** :\n * **Paul Baran** at RAND Corporation proposed │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
✦ The make e2e-test-aks command was executed successfully.
I first switched the kubectl context to the correct Azure AKS cluster (adk-aks-penguin). Then, I verified that the course-creator service was
running with a public LoadBalancer IP. Finally, I ran the end-to-end test, which successfully communicated with the AKS-deployed microservices
(Researcher, Judge, and Content Builder) to generate a course on "The History of the Internet."
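The test output above is a stream of JSON objects, one per line, each with a "type" of either "progress" or "result". A client could consume that stream along these lines. This is a sketch based only on the event shape visible in the output. The function name is illustrative and the transport (SSE or plain HTTP streaming) is left out.

```python
import json
from typing import Iterable, List, Optional, Tuple


def summarize_events(lines: Iterable[str]) -> Tuple[List[str], Optional[str]]:
    """Split a stream of JSON event lines into progress messages and a result.

    Returns (progress_messages, result_text). result_text is None when no
    "result" event was seen.
    """
    progress: List[str] = []
    result: Optional[str] = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # ignore blank keep-alive lines
        event = json.loads(raw)
        if event.get("type") == "progress":
            progress.append(event.get("text", ""))
        elif event.get("type") == "result":
            result = event.get("text")
    return progress, result


if __name__ == "__main__":
    stream = [
        '{"type": "progress", "text": "Researcher is gathering information..."}',
        '{"type": "progress", "text": "Judge is evaluating findings..."}',
        '{"type": "result", "text": "# The History of the Internet"}',
    ]
    steps, course = summarize_events(stream)
    print(f"{len(steps)} progress event(s)")
    print(course)
```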
Running the Web Interface
Start a connection to the AKS-deployed app:
http://20.99.250.177
Then connect to the app:
Then use the online course generator:
Final Gemini CLI Code Review
As a final step, Gemini CLI was used for a full code review of the project:
✦ This multi-agent project is a well-engineered application of ADK 2.5 and the A2A protocol. The separation of specialized agents (Researcher,
Judge, Content Builder) coordinated by a central Orchestrator demonstrates a mature microservice-oriented design.
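The control flow of the research-judge cycle can be illustrated without any framework. The sketch below is plain Python and does not use the ADK API. The function names, the verdict dictionary, and the iteration limit are assumptions for illustration. Only the "pass" status comes from the logs shown earlier.

```python
from typing import Callable, Dict, Tuple


def research_until_pass(
    topic: str,
    research: Callable[[str, str], str],
    judge: Callable[[str], Dict[str, str]],
    max_iterations: int = 3,
) -> Tuple[str, int]:
    """Repeat research -> judge until the judge passes or the limit is hit.

    Returns the last findings and the number of iterations used.
    """
    feedback = ""
    findings = ""
    for iteration in range(1, max_iterations + 1):
        findings = research(topic, feedback)
        verdict = judge(findings)
        if verdict.get("status") == "pass":
            return findings, iteration
        # Feed the judge's critique into the next research round.
        feedback = verdict.get("feedback", "")
    return findings, max_iterations


if __name__ == "__main__":
    # Stub agents: the judge only passes once feedback has been addressed.
    def fake_research(topic: str, feedback: str) -> str:
        return f"notes on {topic}" + (" (revised)" if feedback else "")

    def fake_judge(findings: str) -> Dict[str, str]:
        if "revised" in findings:
            return {"status": "pass"}
        return {"status": "fail", "feedback": "add sources"}

    print(research_until_pass("the history of the internet", fake_research, fake_judge))
```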
Key Strengths
* Coordinated Orchestration: The use of SequentialAgent and LoopAgent creates a robust, iterative research-judge cycle.
* Resilient State Flow: The strategy of passing data through event content and "recovering" it via before_agent_callback heuristics is a clever
way to handle state across distributed, independent session stores.
* Polished Streaming: The web app's SSE implementation, specifically the greedy overlap deduplication (merge_strings) and system message
cleanup, ensures a high-quality user experience despite the inherent noise in multi-agent LLM streams.
* Cloud-Native Readiness: Using Identity Tokens for authenticated service-to-service communication and middleware for dynamic A2A URL rewriting
makes the system ready for production deployment on Cloud Run.
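The overlap deduplication mentioned in the review can be sketched as follows. This is not the project's merge_strings implementation, which is not shown here. It is one plausible reading of "greedy overlap deduplication": when a new chunk begins with text that the accumulated output already ends with, only the non-overlapping remainder is appended.

```python
def merge_strings(existing: str, incoming: str) -> str:
    """Append incoming to existing, dropping the longest prefix of incoming
    that already appears as a suffix of existing."""
    max_overlap = min(len(existing), len(incoming))
    # Greedy: try the largest possible overlap first and shrink.
    for size in range(max_overlap, 0, -1):
        if existing.endswith(incoming[:size]):
            return existing + incoming[size:]
    return existing + incoming


if __name__ == "__main__":
    buffer = ""
    for chunk in ["The history of", "history of the internet", " begins in the 1950s"]:
        buffer = merge_strings(buffer, chunk)
    print(buffer)
```

The scan is quadratic in the chunk length in the worst case, which is usually acceptable for short streamed chunks. A very short accidental overlap, such as a single shared character, will also be merged, so a minimum overlap length may be worth adding in practice.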
Summary
The Agent Development Kit (ADK) was used to build a multi-agent system with A2A support using the Gemini Flash model. The application was tested locally with Gemini CLI and then deployed to Azure AKS. Finally, Gemini CLI was used for a complete project code review.