Chrome DevTools for AI Agents - Real-time debugging, pause, inspect, and step through your AI agent execution.
PyPI version Downloads License: MIT Python 3.9+
Building AI agents is hard. Debugging them is harder.
"Your AI agent worked perfectly in testing. Then it hit production and called the wrong tool 14 times in a loop, burned 40ドル of API credits, and returned gibberish to your user. This is not a rare scenario. It's the default scenario."
Existing observability tools show you what happened after the fact. AgentDBG lets you watch it happen and stop it when things go wrong.
- Real-time Visualization - Watch your agent's execution unfold in real-time
- Pause & Resume - Stop execution at any point to inspect state
- Step-through Debugging - Advance one LLM call at a time
- Breakpoints - Pause on cost thresholds, errors, or custom conditions
- Cost Tracking - Real-time token and cost tracking per span
- Auto-instrumentation - Zero-config support for OpenAI, Anthropic, and LangChain
- Local-first - All data stays on your machine, sub-millisecond overhead
pip install agent-devtools
Run any Python script with AgentDBG instrumentation:
agentdbg run my_agent.py
This will:
- Auto-instrument OpenAI, Anthropic, and LangChain calls
- Start the debugging UI at http://localhost:8766
- Open your browser to the live trace viewer
For more control, use the @traced decorator or trace context manager:
from agentdbg import trace, traced, SpanKind # Using decorator @traced(name="process_query", kind=SpanKind.AGENT_STEP) def process_query(query: str) -> str: # Your agent logic here return result # Using context manager with trace(name="llm_call", kind=SpanKind.LLM_CALL) as span: response = call_llm(messages) span.output_data = {"response": response}
# Run a script with debugging agentdbg run script.py # Run with cost limit (pause when exceeded) agentdbg run script.py --cost-limit 1.0 # Run paused at start agentdbg run script.py --pause-on-start # Start server only (for external connections) agentdbg server # View recent traces agentdbg traces # Show statistics agentdbg stats # Clean up old traces agentdbg cleanup --days 7
- Pause - Stop execution at the current point
- Resume - Continue execution
- Step - Execute one span and pause again
- Clear - Remove all traces
Set breakpoints programmatically:
from agentdbg import get_debugger debugger = get_debugger() # Pause when cost exceeds 0ドル.50 debugger.state.add_breakpoint( lambda span: span.cost.total_cost > 0.50 ) # Pause on any error debugger.state.add_breakpoint( lambda span: span.error is not None ) # Pause on specific span name debugger.state.add_breakpoint( lambda span: "dangerous_tool" in span.name )
AgentDBG automatically tracks costs for popular models:
from agentdbg.config import MODEL_COSTS # Supported models: # - OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo, o1, o1-mini # - Anthropic: claude-3-opus, claude-3-sonnet, claude-3-haiku, claude-3-5-sonnet
Costs are displayed in real-time in the UI and can trigger breakpoints.
from openai import OpenAI from agentdbg.instrumentors import auto_instrument auto_instrument() # Done automatically by CLI client = OpenAI() # All calls are now traced automatically response = client.chat.completions.create(...)
from anthropic import Anthropic from agentdbg.instrumentors import auto_instrument auto_instrument() client = Anthropic() # All calls are now traced automatically response = client.messages.create(...)
from langchain_openai import ChatOpenAI from agentdbg.instrumentors.langchain_instrumentor import AgentDBGCallbackHandler # Use the callback handler handler = AgentDBGCallbackHandler() agent_executor = AgentExecutor( agent=agent, tools=tools, callbacks=[handler], )
Traces are stored locally in SQLite:
from agentdbg.storage import SQLiteStorage storage = SQLiteStorage(".agentdbg/traces.db") # Get recent traces traces = storage.get_traces(limit=10) # Get statistics stats = storage.get_stats() print(f"Total cost: ${stats['total_cost']:.2f}") print(f"Total tokens: {stats['total_tokens']:,}") # Clean up old data storage.delete_old_traces(days=7)
from agentdbg import DebugConfig, AgentDebugger config = DebugConfig( # Server host="127.0.0.1", port=8765, ui_port=8766, # Auto-pause auto_pause_on_error=True, auto_pause_on_cost=1.0, # Pause at 1ドル.00 auto_pause_on_tokens=100000, # Pause at 100k tokens # Data capture capture_inputs=True, capture_outputs=True, max_input_size=10000, max_output_size=10000, ) debugger = AgentDebugger(config=config)
┌─────────────────────────────────────────────────────────────┐
│ Your Agent Code │
│ (OpenAI, Anthropic, LangChain, Custom) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ AgentDBG SDK │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ Instrumentor│ │ Core │ │ Storage │ │
│ │ (auto-wrap)│ │(trace/span) │ │ (SQLite) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ WebSocket Server │
│ (Real-time streaming) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Web UI │
│ ┌────────────┐ ┌────────────┐ ┌────────────────────────┐ │
│ │ Trace List │ │ Span Tree │ │ Inspector (State/Cost) │ │
│ └────────────┘ └────────────┘ └────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
| Feature | AgentDBG | Langfuse | LangSmith |
|---|---|---|---|
| Real-time pause/resume | ✅ | ❌ | ❌ |
| Step-through debugging | ✅ | ❌ | ❌ |
| Breakpoints | ✅ | ❌ | ❌ |
| Local-first | ✅ | ❌ | |
| Zero-config | ✅ | ||
| Open source | ✅ | ✅ | ❌ |
| Cost tracking | ✅ | ✅ | ✅ |
# Clone the repo git clone https://github.com/agentdbg/agentdbg.git cd agentdbg # Install in development mode pip install -e ".[dev]" # Run tests pytest # Run linting ruff check src tests mypy src
Contributions are welcome! Please read our Contributing Guide for details.
MIT License - see LICENSE for details.
Built with frustration and love by developers who've spent too many hours staring at logs wondering why their agent decided to search Google 47 times in a row.
Stop guessing why your agent failed. See every thought. Pause anywhere. Fix it live.