Your AI on WhatsApp — Fully Local, Powered by Gemma

\ -d '{"query":"What is cross-validation?"}' | jq -r .output

Keep this terminal open. The first crew run may take several minutes.

Part 2 — Pull Gemma 4 E2B

ollama pull gemma4:e2b
ollama run gemma4:e2b "Reply in one sentence: what is Gemma 4?"

Recommended sampling (Ollama may already apply defaults): temperature=1, top_p=0.95, top_k=64.
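If you would rather pin these values than rely on whatever defaults ship with the model, one option is a small Ollama Modelfile. The sketch below only writes the file; the function name, the output path, and the derived model name `gemma4-e2b-tuned` in the usage comment are hypothetical and chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: write a Modelfile that pins the sampling parameters listed above.
# The function name and default path are illustrative assumptions.
write_sampling_modelfile() {
  local target="${1:-Modelfile}"
  cat > "$target" <<'EOF'
FROM gemma4:e2b
PARAMETER temperature 1
PARAMETER top_p 0.95
PARAMETER top_k 64
EOF
}

# Usage (hypothetical model name "gemma4-e2b-tuned"):
#   write_sampling_modelfile ./Modelfile
#   ollama create gemma4-e2b-tuned -f ./Modelfile
#   ollama run gemma4-e2b-tuned "Reply in one sentence: what is Gemma 4?"
```

If you do create a derived model, every later reference to the model tag in this guide would need to use the new name, so skip this step unless the defaults give you trouble.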

Part 3 — Install OpenClaw

Node version (required)

OpenClaw needs Node >= 22.12. If node -v shows v20, switch with nvm (you may already have 22 installed):

cd guides/openclaw-gemma-rag
source ./use-node22.sh # uses .nvmrc → 22.22.3
node -v # must be v22.12.0 or higher
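To make the version requirement scriptable, a small guard like the following could run before any openclaw command. It is a minimal sketch: the function name `node_version_ok` is made up for this example, and it only compares the major and minor parts against 22.12.

```shell
#!/usr/bin/env bash
# Sketch: return 0 if a Node version string such as "v22.12.0" is >= 22.12.
# The function name is an illustrative assumption, not part of OpenClaw.
node_version_ok() {
  local version="${1#v}"                 # strip the leading "v"
  local major minor rest
  IFS=. read -r major minor rest <<< "$version"
  minor="${minor:-0}"
  # Reject anything that is not plain digits before doing arithmetic.
  [[ "$major" =~ ^[0-9]+$ && "$minor" =~ ^[0-9]+$ ]] || return 1
  if (( 10#$major > 22 )); then return 0; fi
  if (( 10#$major == 22 && 10#$minor >= 12 )); then return 0; fi
  return 1
}

# Usage:
#   node_version_ok "$(node -v)" || echo "Node >= 22.12 required; try: nvm use 22" >&2
```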

Optional — make Node 22 the default in new terminals:

nvm alias default 22

npm install -g openclaw@latest
openclaw onboard --install-daemon

Follow prompts for workspace, auth, and optional channels. See Getting started.

Set the primary model:

export OLLAMA_API_KEY="ollama-local"
openclaw models list --provider ollama
openclaw models set ollama/gemma4:e2b
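Before setting the primary model, it can help to confirm that Ollama actually has the tag. The sketch below assumes the usual `ollama list` layout (a header row, then one model per row with the name in the first column); the function name `model_listed` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: succeed if a model tag appears in the first column of `ollama list`
# output read from stdin. Assumes a single header row; the name is illustrative.
model_listed() {
  local wanted="$1"
  awk -v want="$wanted" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Usage:
#   ollama list | model_listed gemma4:e2b || ollama pull gemma4:e2b
```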

Config snippet

Copy fields from config/openclaw.snippet.json5 in this guide into ~/.openclaw/openclaw.json.

Critical points:

baseUrl: http://127.0.0.1:11434 — no /v1 suffix
api: "ollama" — native tool calling
agents.defaults.model.primary: "ollama/gemma4:e2b"
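The /v1 mistake is easy to make when copying a URL from an OpenAI-compatible setup. As a rough guard, a check along these lines could be run on the value before you paste it into the config; `base_url_ok` is a made-up helper name, and it only checks the scheme and the suffix.

```shell
#!/usr/bin/env bash
# Sketch: accept an Ollama base URL only if it has an http(s) scheme and
# does not end in /v1. The function name is an illustrative assumption.
base_url_ok() {
  local url="${1%/}"              # ignore a single trailing slash
  case "$url" in
    */v1) return 1 ;;             # OpenAI-compatible path, not wanted here
    http://*|https://*) return 0 ;;
    *) return 1 ;;                # no scheme, or something unexpected
  esac
}

# Usage:
#   base_url_ok "http://127.0.0.1:11434" && echo "baseUrl looks fine"
```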

Restart:

openclaw gateway restart
openclaw gateway status

Part 4 — Install the agentic-rag skill

From this guide directory:

cd guides/openclaw-gemma-rag
chmod +x install-skill.sh skills/agentic-rag/scripts/*.sh
./install-skill.sh

This copies to ~/.openclaw/workspace/skills/agentic-rag/.
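For readers who want to see what such a copy step amounts to, here is a minimal sketch. It is not the guide's actual install-skill.sh; the function name, the optional second argument, and the error message are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: copy a skill directory into the OpenClaw workspace skills folder.
# Not the real install-skill.sh; names and behavior here are assumptions.
install_skill() {
  local src="$1"
  local dest_root="${2:-$HOME/.openclaw/workspace/skills}"
  local name
  name="$(basename "$src")"
  [ -d "$src" ] || { echo "missing skill directory: $src" >&2; return 1; }
  mkdir -p "$dest_root/$name"
  cp -R "$src/." "$dest_root/$name/"
  # Make helper scripts executable if the skill ships any.
  if [ -d "$dest_root/$name/scripts" ]; then
    chmod +x "$dest_root/$name/scripts/"*.sh 2>/dev/null || true
  fi
}

# Usage:
#   install_skill ./skills/agentic-rag
```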

Alternative (if your CLI supports it):

openclaw skills install ./guides/openclaw-gemma-rag/skills/agentic-rag --global

Enable in config:

{
  skills: {
    entries: {
      "agentic-rag": {
        enabled: true,
        env: { RAG_API_URL: "http://127.0.0.1:8001" },
      },
    },
  },
}

Optional allowlist so only this skill is injected:

{
  agents: {
    defaults: {
      skills: ["agentic-rag"],
    },
  },
}

Restart the gateway after skill or config changes.

Skill behavior

The skill teaches OpenClaw to run:

~/.openclaw/workspace/skills/agentic-rag/scripts/rag_query.sh "user question"

The script POSTs the question to the LitServe RAG API and prints the crew's answer. Gemma decides when to invoke the skill; the RAG crew runs on the same model, set as OLLAMA_MODEL=ollama/gemma4:e2b in guides/qwen-agentic-rag/.env (see env.rag.example).
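The request itself is small. The sketch below shows roughly what a rag_query.sh-style helper might look like, assuming the `/predict` route, a `{"query": ...}` body, and an `.output` field in the response, as used elsewhere in this guide. It is not the guide's actual script, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a rag_query.sh-style helper; not the guide's actual script.
# Assumes a /predict route, a {"query": ...} body, and an .output field.
rag_endpoint() {
  local base="${RAG_API_URL:-http://127.0.0.1:8001}"
  printf '%s/predict\n' "${base%/}"     # tolerate a trailing slash
}

rag_query() {
  local question="$1"
  local payload
  # jq builds the JSON body so quotes in the question are escaped correctly.
  payload="$(jq -n --arg q "$question" '{query: $q}')" || return 1
  curl -sS --fail -X POST "$(rag_endpoint)" \
    -H 'Content-Type: application/json' \
    -d "$payload" | jq -r .output
}

# Usage:
#   rag_query "What is cross-validation?"
```

Building the body with jq rather than string concatenation avoids broken JSON when a question contains quotes or backslashes.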

Part 5 — End-to-end test

CLI (no channel)

openclaw agent --message "Using the agentic RAG knowledge base: explain cross-validation in 3 bullets." --thinking low

Watch the gateway logs — you should see an exec invoking rag_query.sh.

Manual script test

export RAG_API_URL=http://127.0.0.1:8001
./skills/agentic-rag/scripts/rag_query.sh "What is regularization?"

Health check

./skills/agentic-rag/scripts/rag_health.sh
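Because the RAG server can take a while to come up, it may be convenient to wrap the health check in a retry loop. This is a generic sketch; the `retry` helper is hypothetical, and it assumes rag_health.sh exits non-zero while the API is unavailable.

```shell
#!/usr/bin/env bash
# Sketch: run a command up to N times, pausing between failed attempts.
# Assumes the wrapped command signals failure with a non-zero exit status.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then return 0; fi
    if (( i < attempts )); then sleep "$delay"; fi
  done
  return 1
}

# Usage: try the health check up to 30 times, 10 seconds apart.
#   retry 30 10 ./skills/agentic-rag/scripts/rag_health.sh
```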

Part 6 — Connect a channel (optional)

Example: Telegram

Create a bot via @BotFather
During openclaw onboard or openclaw configure, add the Telegram channel token
Keep DM pairing enabled (dmPolicy: "pairing") until you trust exposure
Approve yourself: openclaw pairing approve telegram

Send: "Search the ML FAQ: what is gradient descent?"

Flow: Telegram → Gateway → Gemma → agentic-rag skill → RAG API → reply on Telegram.

Channel docs: OpenClaw Channels.

Security checklist

Treat inbound DMs as untrusted — keep pairing on for production-adjacent setups
exec (used by the RAG skill) is powerful; do not expose the gateway to the public internet without first working through the Security and Exposure runbook
Run openclaw doctor after config changes
RAG API binds to localhost by default — keep it that way
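One cheap way to enforce the last point is to refuse any RAG_API_URL that does not point at the local machine. The helper below is a minimal sketch with a made-up name; it only recognizes 127.0.0.1 and localhost and does not handle IPv6 addresses.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if a URL's host is 127.0.0.1 or localhost.
# IPv6 ([::1]) is deliberately not handled; the name is an assumption.
is_loopback_url() {
  local hostport="${1#*://}"      # drop the scheme, if any
  hostport="${hostport%%/*}"      # drop the path
  local host="${hostport%:*}"     # drop the port, if any
  case "$host" in
    127.0.0.1|localhost) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   is_loopback_url "${RAG_API_URL:-http://127.0.0.1:8001}" \
#     || echo "RAG_API_URL is not local; check your config" >&2
```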

Troubleshooting

| Symptom | Fix |
|---------|-----|
| `connection refused` on :8001 | Start `python server.py` in guides/qwen-agentic-rag |
| RAG very slow | Normal on a laptop; avoid running several Ollama models at once (check with `ollama ps`) |
| OpenClaw ignores RAG | Confirm skill installed, `enabled: true`, gateway restarted; ask explicitly to "use agentic RAG" |
| `ollama/gemma4:e2b` not found | `ollama pull gemma4:e2b`; check `openclaw models list` |
| Tool calling errors | Ensure `api: "ollama"` and no `/v1` on baseUrl |
| `openclaw requires Node >=22.12.0` | Run `source guides/openclaw-gemma-rag/use-node22.sh` or `nvm use 22` before any `openclaw` command |
| OOM on 16GB Mac | Only run `gemma4:e2b`; quit other Ollama models (`ollama ps`) |
| Skill `curl` fails | The script pipes `curl` output through `jq`; install it with `brew install jq` or `apt install jq` |
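Several of the rows above come down to a missing command-line tool. A quick preflight like the following sketch can rule those out in one go; `missing_tools` is a hypothetical helper, and the tool list in the usage comment simply reflects the commands used in this guide.

```shell
#!/usr/bin/env bash
# Sketch: print each named tool that is not found on PATH, one per line.
# The function name is an illustrative assumption.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage:
#   missing="$(missing_tools ollama node openclaw curl jq)"
#   [ -z "$missing" ] || echo "Install first: $missing" >&2
```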

What’s next

Add your own documents in guides/qwen-agentic-rag/rag_code.py and re-run setup_vectordb.py
Publish a second OpenClaw skill for Gradio (ui.py) health checks
Route work vs personal agents with multi-agent routing

Summary

| Component | You run |
|-----------|---------|
| Ollama | `gemma4:e2b` (chat + RAG) |
| RAG | `guides/qwen-agentic-rag/server.py` |
| OpenClaw | `openclaw gateway` (daemon) |
| Skill | `agentic-rag` → `rag_query.sh` → `/predict` |

You now have a local-first assistant: Gemma for conversation, CrewAI RAG for grounded ML research — no cloud LLM required for either layer.
