Building a Multimodal Agent with the ADK, Amazon ECS Express, and Gemini Flash Live 3.1


Lint and Test the Main Python Code

The final step is to build, lint, and test the main Python code.

To Lint:

xbill@penguin:~/gemini-cli-aws/gemini31-ecsexpress$ make lint
ruff check .
All checks passed!
ruff format --check .
10 files already formatted
cd frontend && npm run lint
> frontend@0.0.0 lint
> eslint .

To Test:

xbill@penguin:~/gemini-cli-aws/gemini31-ecsexpress$ make test
python -m pytest
================================================================ test session starts ================================================================
platform linux -- Python 3.13.13, pytest-9.0.3, pluggy-1.6.0
rootdir: /home/xbill/gemini-cli-aws/gemini31-ecsexpress
plugins: anyio-4.13.0, asyncio-1.3.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 8 items 
backend/app/biometric_agent/test_agent.py ..... [62%]
test_live_connection.py . [75%]
test_ws_backend.py . [87%]
test_ws_backend_v2.py . [100%]
================================================================= warnings summary ==================================================================
../../.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72
 /home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
 check_feature_enabled()
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================================================== 8 passed, 1 warning in 2.39s ============================================================
xbill@penguin:~/gemini-cli-aws/gemini31-ecsexpress$

Running Locally

The main Python code can then be run locally:

xbill@penguin:~/gemini-cli-aws/gemini31-ecsexpress$ make run
. ./biosync.sh
Local URL
http://127.0.0.1:8080/
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
 check_feature_enabled()
2026-04-11 12:24:23,407 - INFO - System Config: 2.0 FPS, 10.0s Heartbeat
Serving static files from: /home/xbill/gemini-cli-aws/gemini31-ecsexpress/frontend/dist
INFO: Started server process [5299]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)

Then connect to the local front end:

Deploying to ECS Express

A utility script runs the deployment to AWS ECS Express. Use the deploy version from the local system:

aws login --remote
xbill@penguin:~/gemini-cli-aws/gemini31-ecsexpress$ source save-aws-creds.sh 
Exporting AWS credentials...
Successfully saved credentials to .aws_creds
The Makefile will now automatically use these for deployments.
xbill@penguin:~/gemini-cli-aws/gemini31-ecsexpress$ 
The Makefile will now automatically use these for deployments.
Creating IAM roles for ECS Express Mode...
 0.0s 0.0s

You can validate the final result by checking the messages:


✦ The make status command successfully returned the status of your ECS service:
 * Service Name: biometric-scout
 * Status: ACTIVE
 * Endpoint: bi-854696446eb9414eb6d479ba78c68112.ecs.us-east-1.on.aws

Once the container is deployed, you can get the endpoint:

 >make endpoint
✦ The make endpoint command will retrieve and display the public endpoint URL for the deployed application.
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ ✓ Shell make endpoint [current working directory /home/xbill/gemini-cli-aws/gemini31-ecsexpress] (Execute the make endpoint command to get ... │
│ │
│ bi-854696446eb9414eb6d479ba78c68112.ecs.us-east-1.on.aws │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
✦ The make endpoint command returned: bi-854696446eb9414eb6d479ba78c68112.ecs.us-east-1.on.aws

The service will be visible in the AWS console:

Running the Web Interface

Start a connection to the deployed app:

https://bi-854696446eb9414eb6d479ba78c68112.ecs.us-east-1.on.aws

Then connect to the app:

Then use the Live model to process audio and video:

Finally, complete the sequence:

So What Really Changed in the Lab? It Looks the Same to Me!

This project was a great learning experience: a first-hand look at working with different LLM models, common issues with the ADK, and the nuances of a real-time Live model.

The key changes include:

Gemini Models vs Vertex AI Models

The original lab used LLM models hosted on Vertex AI with PROJECT_ID and REGION authentication. The Gemini 3.1 Flash Live model is currently supported only through the Gemini API, which requires an API key. Changing the authentication method and API also changes which models are supported, so you have to check that the model you want to use is available on that API.
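A minimal sketch of how the two authentication styles differ when building a client with the google-genai SDK. The environment variable names are the ones that SDK reads; whether this project uses them (rather than PROJECT_ID and REGION) is an assumption, and the helper name `client_kwargs` is hypothetical.

```python
import os
from typing import Mapping


def client_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Pick keyword arguments for google.genai.Client from the environment."""
    use_vertex = env.get("GOOGLE_GENAI_USE_VERTEXAI", "").lower() in ("1", "true")
    if use_vertex:
        # Vertex AI: identified by project and region.
        return {
            "vertexai": True,
            "project": env["GOOGLE_CLOUD_PROJECT"],
            "location": env["GOOGLE_CLOUD_LOCATION"],
        }
    # Gemini API: identified by an API key.
    return {"api_key": env["GOOGLE_API_KEY"]}


# Usage (requires the google-genai package):
#   from google import genai
#   client = genai.Client(**client_kwargs())
```

Because the set of available models depends on which branch is taken, it is worth confirming the model ID against the chosen API before wiring it into the agent.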

Support for Gemini 3.1 Live

The API for the 3.1 version of the Live model is slightly different and required a translation layer for the current version of the ADK. This "monkey patch" acts as a workaround until the ADK fully supports the 3.1 Live model. Several open GitHub issues cover these problems, and they are noted in the GEMINI.md file.
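The general monkey-patching technique can be sketched as follows. This is not the project's patch_adk.py: both classes are stand-ins invented for illustration, and the only details taken from this post are the names send_realtime_input and media_chunks, which the code review further down mentions.

```python
class StubSession:
    """Stand-in for a live session; NOT the real ADK or Gemini class."""

    def __init__(self) -> None:
        self.sent: list[bytes] = []

    def send_realtime_input(self, *, media: bytes) -> None:
        # Assumed shape of the newer call: one media item per call.
        self.sent.append(media)


class StubConnection:
    """Stand-in for library code that still sends a batch of chunks."""

    def __init__(self, session: StubSession) -> None:
        self.session = session

    def send_chunks(self, media_chunks: list[bytes]) -> None:
        raise NotImplementedError("batched media_chunks are not supported")


def apply_patch() -> None:
    """Swap in a send_chunks that unrolls the batch into single sends."""
    if getattr(StubConnection.send_chunks, "_patched", False):
        return  # already applied; keep the patch idempotent

    def send_chunks_unrolled(self: StubConnection, media_chunks: list[bytes]) -> None:
        for chunk in media_chunks:
            self.session.send_realtime_input(media=chunk)

    send_chunks_unrolled._patched = True
    StubConnection.send_chunks = send_chunks_unrolled
```

Applying the patch once at startup, before any connection object is used, keeps the workaround in one place so it is easy to delete when the library catches up.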

Cloud Shell CORS Override

Recent versions of the ADK are more restrictive when running from Cloud Shell, so a CORS override was needed.

ADK CLI Support

The ADK agent was extended to detect when it is running from the CLI and avoid errors related to the Live model.
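The code review further down describes a get_model_id() function that switches to gemini-2.5-flash for CLI runs and defaults to gemini-3.1-flash-live-preview otherwise. A minimal sketch of that idea is shown here; the detection mechanism (an environment variable named ADK_CLI_MODE) is a hypothetical placeholder, since the post does not say how the project detects a CLI run.

```python
import os
from typing import Mapping

LIVE_MODEL = "gemini-3.1-flash-live-preview"  # default, per the review
CLI_MODEL = "gemini-2.5-flash"                # fallback for CLI runs, per the review


def get_model_id(env: Mapping[str, str] = os.environ) -> str:
    """Return the non-Live model when the (hypothetical) CLI flag is set."""
    if env.get("ADK_CLI_MODE", "").lower() in ("1", "true"):
        return CLI_MODEL
    return LIVE_MODEL
```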

Audio Support Client Side

The client-side audio API was updated to the recommended Worklet implementation to get audio processing off the main thread.

Re-Architected the Protocol to be Mostly JSON-Free

The original version wrapped the binary messages in JSON, which makes them easier to work with but also introduces overhead. The updated code sends the audio and video streams as raw data.
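To illustrate the trade-off, here is a small sketch of a type-tagged binary frame next to a JSON envelope. The type bytes 0x01 (audio) and 0x02 (video) come from the code review later in this post; the one-byte-prefix layout and the JSON envelope shape are assumptions made for comparison, not the project's actual wire format.

```python
import base64
import json

AUDIO = 0x01  # type byte for audio, as quoted in the code review
VIDEO = 0x02  # type byte for video, as quoted in the code review


def encode_frame(kind: int, payload: bytes) -> bytes:
    """Prefix the raw payload with a single type byte (assumed layout)."""
    if kind not in (AUDIO, VIDEO):
        raise ValueError(f"unknown frame type: {kind:#04x}")
    return bytes([kind]) + payload


def decode_frame(frame: bytes) -> tuple[int, bytes]:
    """Split a frame back into (type byte, payload)."""
    if not frame:
        raise ValueError("empty frame")
    kind, payload = frame[0], frame[1:]
    if kind not in (AUDIO, VIDEO):
        raise ValueError(f"unknown frame type: {kind:#04x}")
    return kind, payload


def json_wrapped(payload: bytes) -> bytes:
    """Hypothetical JSON envelope, included only for a size comparison."""
    body = {"type": "audio", "data": base64.b64encode(payload).decode("ascii")}
    return json.dumps(body).encode("utf-8")
```

The binary frame adds one byte per message, while base64 inside JSON grows the payload by roughly a third before the envelope is even counted, which adds up at streaming rates.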

Gemini CLI Development Support

README.md and GEMINI.md were added to the project, as well as an Agent Skill for working with Live models.

Gemini CLI Code Review

As a final step, Gemini CLI was used for a full code review of the project:

✦ Based on my comprehensive review of the "Alpha Rescue Drone - Biometric Security System," the codebase is exceptionally well-structured,
 follows modern engineering standards, and strictly adheres to the specialized requirements for Gemini 3.1 Flash Live and the Google Agent
 Development Kit (ADK).
Core Architecture & Integration
 The system excels at orchestrating a high-frequency, multimodal data stream. The use of a binary protocol (0x01 for audio, 0x02 for video)
 in useGeminiSocket.js minimizes overhead, while the monkey-patching utility in patch_adk.py demonstrates a sophisticated understanding of
 the evolving Gemini 3.1 Live API. This proactive approach ensures compatibility with features like send_realtime_input that are not yet
 natively handled by the standard ADK.
Backend Analysis
 - Agent Intelligence: The instructions in agent.py are precise and prioritize safety (offensive gesture detection) and hidden overrides
 ("Devil's Horns") effectively.
 - Resilience: The backend implements a heartbeat stimulus (CONTINUE_SURVEILLANCE) and a neural handshake, which are critical for
 maintaining model engagement during visual-only periods.
 - Model Fallback: The get_model_id() logic is clever, automatically switching to gemini-2.5-flash for CLI runs to prevent 404 errors,
 while defaulting to the high-performance gemini-3.1-flash-live-preview for production.
Frontend & UI/UX
 - Aesthetic: The "Neon Cyan" HUD in BiometricLock.jsx provides a high-fidelity, futuristic feel that aligns with the "Mission Alpha"
 theme.
 - Feedback Loop: The synchronization between the random sequence generation, visual matching, and the special "Heavy Metal" audio event
 creates a compelling user experience.
 - Efficiency: Utilizing toBlob with JPEG compression (0.6) for the 2 FPS video stream is an optimal choice for balancing visual clarity
 with network latency.
Testing & Maintainability
 - Validation: The project maintains a clean slate with 100% test and linting compliance. The inclusion of mock.sh is a significant
 developer productivity boost.
 - Standards: The use of Ruff for Python and ESLint for the frontend ensures long-term code quality.
Recommendations
 1. Connection Resilience: While the SYSTEM_ERROR closes the socket correctly, the frontend could benefit from a more graceful
 "Reconnecting..." state if the network fluctuates during a mission.
 2. Patch Management: As the Gemini Live API stabilizes, keep an eye on official ADK updates to deprecate the manual unrolling of
 media_chunks in patch_adk.py.
 3. Security: Verify that the .env variables are correctly injected into the Cloud Run environment via the Makefile or cloudbuild.yaml
 without being exposed in the build logs.
Overall, the system is a robust and creative implementation of real-time multimodal AI. Authentication Protocol: Stable. 🤘

Summary

The Agent Development Kit was used to build a multimodal agent on the Gemini Live model. The agent was tested locally with the CLI and then deployed to Amazon ECS Express. The key takeaways from the transition to the new Gemini Live model were summarized. Finally, Gemini CLI was used for a complete code review of the project.