Local AI infrastructure. Your hardware. Your data. Your rules.
One installer gets you from bare metal to a fully running local AI stack — LLM inference, chat UI, voice agents, workflow automation, RAG, and privacy tools. No manual config. No dependency hell. No six months of piecing it together. Run one command, answer a few questions, everything works.
curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/get-dream-server.sh | bash
Dream Server installer — auto-detects GPU, recommends model tier, and lets you choose your stack
The installer detects your hardware, picks the optimal model, and asks how deep you want to go.
Everything running, at a glance. GPU metrics, service health, one-click access to Chat, Voice, Workflows, Agents, and Documents.
Dream Server dashboard — GPU metrics, service status, feature cards
graph TB
subgraph User[" You "]
Browser(["Browser"])
Mic(["Microphone"])
API(["API Client"])
end
subgraph DreamServer["Dream Server (Docker Compose)"]
subgraph Core["Core"]
VLLM["vLLM · :8000<br/>LLM Inference"]
WebUI["Open WebUI · :3000<br/>Chat Interface"]
Dashboard["Dashboard · :3001<br/>GPU Metrics"]
end
subgraph Voice["Voice"]
Whisper["Whisper · :9000<br/>Speech → Text"]
Kokoro["Kokoro · :8880<br/>Text → Speech"]
LiveKit["LiveKit · :7880<br/>WebRTC"]
VoiceAgent["Voice Agent"]
end
subgraph RAGp["RAG"]
Qdrant["Qdrant · :6333<br/>Vector DB"]
Embeddings["Embeddings · :8090"]
end
subgraph Workflows["Workflows"]
N8N["n8n · :5678<br/>400+ Integrations"]
end
subgraph Agents["Agents"]
OpenClaw["OpenClaw · :7860<br/>Multi-Agent"]
ToolProxy["Tool Proxy<br/>vLLM Bridge"]
end
subgraph Privacy["Privacy"]
Shield["Privacy Shield · :8085<br/>PII Redaction"]
end
end
Browser --> WebUI
Browser --> Dashboard
Browser --> N8N
Mic --> LiveKit
API --> VLLM
WebUI --> VLLM
VoiceAgent --> Whisper
VoiceAgent --> Kokoro
VoiceAgent --> VLLM
LiveKit --> VoiceAgent
OpenClaw --> ToolProxy
ToolProxy --> VLLM
Shield -.->|PII scrubbed| VLLM
style Core fill:#e8f0fe,stroke:#1a73e8,color:#1a1a1a
style Voice fill:#fce8e6,stroke:#d93025,color:#1a1a1a
style RAGp fill:#e6f4ea,stroke:#1e8e3e,color:#1a1a1a
style Workflows fill:#fef7e0,stroke:#f9ab00,color:#1a1a1a
style Agents fill:#f3e8fd,stroke:#9334e6,color:#1a1a1a
style Privacy fill:#e8eaed,stroke:#5f6368,color:#1a1a1a
The installer auto-detects your GPU and activates the right profiles. Core services start immediately; voice, RAG, workflows, and agents activate based on your hardware and preferences.
Hobbyists — Want local ChatGPT without subscriptions? Install Dream Server, open localhost:3000, start chatting. Voice mode, document Q&A, and workflow automation are one toggle away.
Developers — Building AI agents? Dream Server gives you a local OpenAI-compatible API (vLLM), multi-agent coordination (OpenClaw), and a workflow engine (n8n) — all on your GPU. No API keys, no rate limits, no cost per token.
Teams — Need private AI infrastructure? Everything runs on your hardware. The Privacy Shield scrubs PII before anything leaves your network. Deploy once, use from any device on your LAN.
| Component | What It Does |
|---|---|
| vLLM | GPU-accelerated LLM inference with continuous batching — auto-selects 7B to 72B models for your hardware |
| Open WebUI | Full-featured chat interface with conversation history, model switching, web search |
| Dashboard | Real-time GPU metrics (VRAM, temp, utilization), service health, model management |
| Whisper | Speech-to-text — local, fast, private |
| Kokoro | Text-to-speech — natural-sounding voices, no cloud |
| LiveKit | Real-time WebRTC voice conversations — talk to your AI like a phone call |
| n8n | Visual workflow automation with 400+ integrations (GitHub, Slack, email, webhooks) |
| Qdrant | Vector database for document Q&A (RAG) |
| OpenClaw | Multi-agent AI framework — agents coordinating autonomously on your GPU |
| Privacy Shield | PII redaction proxy — scrubs personal data before any external API call |
| Tier | VRAM | Model | Example GPUs |
|---|---|---|---|
| Entry | <12GB | Qwen2.5-7B | RTX 3080, RTX 4070 |
| Prosumer | 12–20GB | Qwen2.5-14B-AWQ | RTX 3090, RTX 4080 |
| Pro | 20–40GB | Qwen2.5-32B-AWQ | RTX 4090, A6000 |
| Enterprise | 40GB+ | Qwen2.5-72B-AWQ | A100, H100, multi-GPU |
Bootstrap mode: Chat in 2 minutes. A tiny model starts instantly while the full model downloads in the background. Hot-swap with zero downtime when ready.
| Dream Server | Ollama + Open WebUI | LocalAI | |
|---|---|---|---|
| Full-stack install (LLM + voice + workflows + RAG + privacy) | One command | Manual assembly | Manual assembly |
| Hardware auto-detection + model selection | Yes | No | No |
| Voice agents (STT + TTS + WebRTC) | Built in | No | Partial |
| Inference engine | vLLM (continuous batching) | llama.cpp | llama.cpp |
| Workflow automation | n8n (400+ integrations) | No | No |
| PII redaction | Built in | No | No |
| Multi-agent framework | OpenClaw | No | No |
Ollama is great for running models locally. Dream Server is a complete AI platform — inference, voice, workflows, RAG, agents, privacy, and monitoring in one installer.
Standalone tools for running persistent AI agents in production. Each works independently — grab what you need.
| Tool | Purpose |
|---|---|
| Guardian | Self-healing process watchdog — monitors services, auto-restores from backup, runs as root so agents can't kill it |
| Memory Shepherd | Periodic memory reset to prevent identity drift in long-running agents |
| Token Spy | API cost monitoring with real-time dashboard and auto-kill for runaway sessions |
| vLLM Tool Proxy | Makes local vLLM tool calling work with OpenClaw — SSE re-wrapping, extraction, loop protection |
| LLM Cold Storage | Archives idle HuggingFace models to free disk, keeps them resolvable via symlink |
These tools were born from the OpenClaw Collective — 3 AI agents running autonomously on local GPUs, producing 3,464 commits in 8 days. Dream Server packages the infrastructure they built into something anyone can use.
| Quickstart | Step-by-step install guide with troubleshooting |
| FAQ | Common questions, hardware advice, configuration |
| Hardware Guide | GPU recommendations with real prices |
| Cookbook | Recipes: voice agents, RAG pipelines, code assistant, privacy proxy |
| Architecture | Deep dive into the system design |
| Contributing | How to contribute to Lighthouse AI |
Windows: install.ps1 handles WSL2 + Docker + NVIDIA drivers automatically.
Apache 2.0 — see LICENSE. Use it, modify it, ship it.
Built by Lightheart Labs and the OpenClaw Collective.