Keep this terminal open. The first crew run may take several minutes.
Part 2 — Pull Gemma 4 E2B
ollama pull gemma4:e2b
ollama run gemma4:e2b "Reply in one sentence: what is Gemma 4?"
Recommended sampling (Ollama may already apply defaults): temperature=1, top_p=0.95, top_k=64.
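To pin these sampling values per request rather than rely on Ollama's defaults, one option is the options field of Ollama's REST API. The sketch below assumes Ollama is listening on its default local address (127.0.0.1:11434); the helper names build_generate_request and generate are illustrative and not part of this guide's repo.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"

# Sampling values recommended in this guide.
SAMPLING = {"temperature": 1.0, "top_p": 0.95, "top_k": 64}


def build_generate_request(prompt: str, model: str = "gemma4:e2b") -> urllib.request.Request:
    """Build a non-streaming /api/generate request with explicit sampling options."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a token stream
        "options": dict(SAMPLING),
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 300.0) -> str:
    """Send the request and return the generated text (needs a running Ollama)."""
    with urllib.request.urlopen(build_generate_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the model pulled, generate("Reply in one sentence: what is Gemma 4?") should return roughly what the ollama run command above prints.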
Part 3 — Install OpenClaw
Node version (required)
OpenClaw needs Node >= 22.12. If node -v shows v20, switch with nvm (you may already have 22 installed):
cd guides/openclaw-gemma-rag
source ./use-node22.sh # uses .nvmrc → 22.22.3
node -v # must be v22.12.0 or higher
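If you script this setup, the version requirement can be checked before any openclaw command runs. This is a minimal sketch, assuming node -v prints a string such as v22.12.0; the function names are hypothetical and only the 22.12.0 minimum comes from this guide.

```python
import re
import subprocess

MIN_NODE = (22, 12, 0)  # minimum version stated in this guide


def parse_node_version(text: str) -> tuple:
    """Turn 'v22.12.0' (as printed by `node -v`) into (22, 12, 0)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized Node version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def node_is_supported(version_text: str, minimum: tuple = MIN_NODE) -> bool:
    """Compare versions numerically, so 22.9.0 is correctly older than 22.12.0."""
    return parse_node_version(version_text) >= minimum


def installed_node_version() -> str:
    """Return the raw output of `node -v` (raises if node is not on PATH)."""
    result = subprocess.run(["node", "-v"], capture_output=True, text=True, check=True)
    return result.stdout
```

Comparing tuples of integers avoids the string-comparison trap where "22.9" sorts after "22.12".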
Optional — make Node 22 the default in new terminals:
nvm alias default 22
npm install -g openclaw@latest
openclaw onboard --install-daemon
Follow prompts for workspace, auth, and optional channels. See Getting started.
Set the primary model:
export OLLAMA_API_KEY="ollama-local"
openclaw models list --provider ollama
openclaw models set ollama/gemma4:e2b
Config snippet
Copy fields from config/openclaw.snippet.json5 in this guide into ~/.openclaw/openclaw.json.
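Critical points (see Troubleshooting): the Ollama provider must use api: "ollama", and its baseUrl must not end in /v1.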
Restart:
openclaw gateway restart
openclaw gateway status
Part 4 — Install the agentic-rag skill
From this guide directory:
cd guides/openclaw-gemma-rag
chmod +x install-skill.sh skills/agentic-rag/scripts/*.sh
./install-skill.sh
This copies to ~/.openclaw/workspace/skills/agentic-rag/.
Alternative (if your CLI supports it):
openclaw skills install ./guides/openclaw-gemma-rag/skills/agentic-rag --global
Enable in config:
{
  skills: {
    entries: {
      "agentic-rag": {
        enabled: true,
        env: { RAG_API_URL: "http://127.0.0.1:8001" },
      },
    },
  },
}
Optional allowlist so only this skill is injected:
{
  agents: {
    defaults: {
      skills: ["agentic-rag"],
    },
  },
}
Restart the gateway after skill or config changes.
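The two snippets above are meant to be combined with whatever your config already contains, not to replace it. The sketch below illustrates that merge on plain Python dictionaries. It is an illustration only: deep_merge and SKILL_PATCH are hypothetical names, the "other-skill" entry in the usage note is made up, and it says nothing about how OpenClaw itself loads its JSON5 config.

```python
import copy


def deep_merge(base: dict, patch: dict) -> dict:
    """Return a new dict where nested keys from patch are layered onto base."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the patch replace the existing value.
            merged[key] = copy.deepcopy(value)
    return merged


# The skill entry and the optional allowlist from this guide, as Python data.
SKILL_PATCH = {
    "skills": {
        "entries": {
            "agentic-rag": {
                "enabled": True,
                "env": {"RAG_API_URL": "http://127.0.0.1:8001"},
            }
        }
    },
    "agents": {"defaults": {"skills": ["agentic-rag"]}},
}
```

For example, deep_merge({"skills": {"entries": {"other-skill": {"enabled": True}}}}, SKILL_PATCH) keeps the existing entry and adds agentic-rag next to it. Drop the agents key from the patch if you do not want the allowlist.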
Skill behavior
The skill teaches OpenClaw to run:
~/.openclaw/workspace/skills/agentic-rag/scripts/rag_query.sh "user question"
The script POSTs the question to the LitServe server and prints the crew's answer. Gemma decides when to invoke the skill; the RAG crew runs on the same model, set as OLLAMA_MODEL=ollama/gemma4:e2b in guides/qwen-agentic-rag/.env (see env.rag.example).
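For readers who want to call the RAG API without the shell script, the request can be sketched in Python. The /predict path and the RAG_API_URL default come from this guide; the JSON field name "query" is an assumption, since the request shape depends on how server.py decodes it, so check that file before relying on this. The function names are illustrative.

```python
import json
import os
import urllib.request
from typing import Optional

DEFAULT_RAG_API_URL = "http://127.0.0.1:8001"


def build_rag_request(question: str, base_url: Optional[str] = None) -> urllib.request.Request:
    """Build a POST to <RAG_API_URL>/predict carrying the user's question."""
    base = base_url or os.environ.get("RAG_API_URL", DEFAULT_RAG_API_URL)
    # Assumption: the server expects {"query": "..."}; adjust to match server.py.
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        base.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_rag(question: str, timeout: float = 600.0) -> str:
    """Send the question and return the raw response body (needs the RAG server)."""
    with urllib.request.urlopen(build_rag_request(question), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

The long default timeout reflects the note that the first crew run may take several minutes.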
Part 5 — End-to-end test
CLI (no channel)
openclaw agent --message "Using the agentic RAG knowledge base: explain cross-validation in 3 bullets." --thinking low
Watch the gateway logs — you should see an exec invoking rag_query.sh.
Manual script test
export RAG_API_URL=http://127.0.0.1:8001
./skills/agentic-rag/scripts/rag_query.sh "What is regularization?"
Health check
./skills/agentic-rag/scripts/rag_health.sh
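The contents of rag_health.sh are not shown here, so the following is not a port of it. It is a rough, independent check that something is accepting TCP connections on the RAG port, which is enough to tell "server not started" apart from other failures. port_is_open is a hypothetical helper name.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 8001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

A False result for port 8001 corresponds to the connection refused row in the Troubleshooting table: start the RAG server first.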
Part 6 — Connect a channel (optional)
Example: Telegram
- Create a bot via @BotFather
- During openclaw onboard or openclaw configure, add the Telegram channel token
- Keep DM pairing enabled (dmPolicy: "pairing") until you trust exposure
- Approve yourself: openclaw pairing approve telegram
Send: "Search the ML FAQ: what is gradient descent?"
Flow: Telegram → Gateway → Gemma → agentic-rag skill → RAG API → reply on Telegram.
Channel docs: OpenClaw Channels.
Security checklist
- Treat inbound DMs as untrusted — keep pairing on for production-adjacent setups
- exec (used by the RAG skill) is powerful; do not expose the gateway to the public internet without first working through the Security and Exposure runbook
- Run openclaw doctor after config changes
- RAG API binds to localhost by default — keep it that way
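The last checklist item can be verified mechanically before a script sends anything to the RAG API. This is a small sketch using only the Python standard library; is_loopback_url is a hypothetical helper, and treating the name localhost as loopback is an assumption about your hosts file.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True only if the URL's host is a loopback address or 'localhost'."""
    host = urlsplit(url).hostname  # lowercased, brackets stripped for IPv6
    if host is None:
        return False
    if host == "localhost":
        # Assumption: localhost resolves to a loopback address on this machine.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-local.
        return False
```

Note that 0.0.0.0 is rejected: a server bound to it listens on every interface, which is exactly what this checklist warns against.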
Troubleshooting
| Symptom | Fix |
|---------|-----|
| `connection refused` on :8001 | Start `python server.py` in qwen-agentic-rag |
| RAG very slow | Normal on a laptop; reduce parallel Ollama loads |
| OpenClaw ignores RAG | Confirm skill installed, `enabled: true`, gateway restarted; ask explicitly to "use agentic RAG" |
| `ollama/gemma4:e2b` not found | `ollama pull gemma4:e2b`; check `openclaw models list` |
| Tool calling errors | Ensure `api: "ollama"` and no `/v1` on baseUrl |
| `openclaw requires Node >=22.12.0` | Run `source guides/openclaw-gemma-rag/use-node22.sh` or `nvm use 22` before any `openclaw` command |
| OOM on 16GB Mac | Only run `gemma4:e2b`; quit other Ollama models (`ollama ps`) |
| Skill `curl` fails | Install `jq`: `brew install jq` or `apt install jq` |
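Several rows in the table come down to a missing command-line tool. A quick preflight check can report them all at once. The tool list below is inferred from the commands used in this guide (that the skill script needs both curl and jq is an assumption based on the last row), and missing_tools is a hypothetical helper.

```python
import shutil

# Tools this guide invokes; curl and jq are assumed from the troubleshooting table.
REQUIRED_TOOLS = ("ollama", "node", "openclaw", "curl", "jq")


def missing_tools(tools=REQUIRED_TOOLS) -> list:
    """Return the subset of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from missing_tools() means every tool was found on PATH; it does not verify versions, so the Node check still applies.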
What’s next
- Add your own documents in guides/qwen-agentic-rag/rag_code.py and re-run setup_vectordb.py
- Publish a second OpenClaw skill for Gradio (ui.py) health checks
- Route work vs personal agents with multi-agent routing
Summary
| Component | You run |
|-----------|---------|
| Ollama | `gemma4:e2b` (chat + RAG) |
| RAG | `guides/qwen-agentic-rag/server.py` |
| OpenClaw | `openclaw gateway` (daemon) |
| Skill | `agentic-rag` → `rag_query.sh` → `/predict` |
You now have a local-first assistant: Gemma for conversation, CrewAI RAG for grounded ML research — no cloud LLM required for either layer.