
Pydantic AI

Persistent memory tools for Pydantic AI agents via Hindsight. Give your agents long-term memory with retain, recall, and reflect, all async-native with no thread-pool hacks.


Features​

  • Async-Native Tools: Uses Pydantic AI's async tool interface directly (aretain, arecall, areflect)
  • Memory Instructions: Auto-inject relevant memories into every agent run via instructions=[...]
  • Three Memory Tools: Retain (store), Recall (search), Reflect (synthesize); include any combination
  • Simple Configuration: Configure once globally, or pass a client directly
  • Lightweight: Depends on pydantic-ai-slim to avoid pulling in all model providers

Installation​

pip install hindsight-pydantic-ai

Quick Start​

Recommended: Hindsight Cloud

Sign up free and grab an API key; no self-hosting required.

import asyncio

from hindsight_client import Hindsight
from hindsight_pydantic_ai import create_hindsight_tools, memory_instructions
from pydantic_ai import Agent

client = Hindsight(base_url="https://api.hindsight.vectorize.io", api_key="hsk_...")

agent = Agent(
    "openai:gpt-4o",
    tools=create_hindsight_tools(client=client, bank_id="user-123"),
    instructions=[memory_instructions(client=client, bank_id="user-123")],
)


async def main():
    # agent.run() is a coroutine, so it has to be awaited inside an event loop
    result = await agent.run("What do you remember about my preferences?")
    print(result.output)


asyncio.run(main())

The agent now has three tools it can call:

  • hindsight_retain: Store information to long-term memory
  • hindsight_recall: Search long-term memory for relevant facts
  • hindsight_reflect: Synthesize a reasoned answer from memories

The memory_instructions callable automatically recalls relevant memories and injects them into the system prompt on every run.

Self-hosting (local development)​

If you're running Hindsight locally with ./scripts/dev/start-api.sh, swap the URL:

client = Hindsight(base_url="http://localhost:8888")

See the installation guide for self-hosting setup.

Tools Only (No Auto-Injection)​

If you want the agent to decide when to use memory rather than always injecting context:

agent = Agent(
    "openai:gpt-4o",
    tools=create_hindsight_tools(client=client, bank_id="user-123"),
)

Instructions Only (No Tools)​

If you just want memories auto-injected without giving the agent explicit memory tools:

agent = Agent(
    "openai:gpt-4o",
    instructions=[memory_instructions(client=client, bank_id="user-123")],
)

Selecting Tools​

Include only the tools you need:

tools = create_hindsight_tools(
    client=client,
    bank_id="user-123",
    include_retain=True,
    include_recall=True,
    include_reflect=False,  # Omit reflect
)
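To see which tools a given combination of flags leaves the agent with, the mapping can be sketched in plain Python. The helper `selected_tool_names` is hypothetical and not part of hindsight-pydantic-ai; it only mirrors the three include_* flags and the tool names listed in the Quick Start.

```python
# Hypothetical helper: mirrors the include_* flags; not part of the library.
TOOL_FLAGS = {
    "include_retain": "hindsight_retain",
    "include_recall": "hindsight_recall",
    "include_reflect": "hindsight_reflect",
}


def selected_tool_names(include_retain=True, include_recall=True, include_reflect=True):
    """Return the tool names that stay enabled for the given flags."""
    flags = {
        "include_retain": include_retain,
        "include_recall": include_recall,
        "include_reflect": include_reflect,
    }
    return [name for flag, name in TOOL_FLAGS.items() if flags[flag]]


print(selected_tool_names(include_reflect=False))
# → ['hindsight_retain', 'hindsight_recall']
```

Leaving out reflect this way keeps the agent able to store and search memories while never asking Hindsight to synthesize an answer.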

Global Configuration​

Instead of passing a client to every call, configure once:

from hindsight_pydantic_ai import configure, create_hindsight_tools

configure(
    hindsight_api_url="https://api.hindsight.vectorize.io",  # Hindsight Cloud (default)
    api_key="your-api-key",  # Or set HINDSIGHT_API_KEY env var
    budget="mid",  # Recall budget: low/mid/high
    max_tokens=4096,  # Max tokens for recall results
    tags=["env:prod"],  # Tags for stored memories
    recall_tags=["scope:global"],  # Tags to filter recall
    recall_tags_match="any",  # Tag match mode: any/all/any_strict/all_strict
)

# Now create tools without passing client; uses global config
tools = create_hindsight_tools(bank_id="user-123")
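The four recall_tags_match modes are not spelled out on this page. The sketch below shows one plausible reading, assuming that `any` requires at least one overlapping tag, `all` requires every requested tag, and the `_strict` variants additionally exclude untagged memories that the non-strict modes let through. These semantics and the helper `matches` are assumptions for illustration only; check the Hindsight recall documentation before relying on them.

```python
VALID_MODES = {"any", "all", "any_strict", "all_strict"}


def matches(memory_tags, recall_tags, mode="any"):
    """Assumed tag-filter semantics; not taken from the library."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    memory, wanted = set(memory_tags), set(recall_tags)
    if not memory:
        # Assumption: untagged memories pass only in the non-strict modes.
        return not mode.endswith("_strict")
    if mode.startswith("any"):
        return bool(memory & wanted)  # at least one requested tag present
    return wanted <= memory  # every requested tag present


wanted = ["scope:global", "env:prod"]
print(matches(["scope:global"], wanted, "any"))
print(matches(["scope:global"], wanted, "all"))
```

Under these assumptions the first call matches and the second does not, because the memory carries only one of the two requested tags.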

Per-Tool Overrides​

Constructor arguments override global configuration:

tools = create_hindsight_tools(
    bank_id="user-123",
    budget="high",  # Override global budget
    max_tokens=8192,  # Override global max_tokens
    tags=["session:abc"],  # Override global tags
)
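The precedence described here (constructor argument first, then global configuration) can be sketched as a simple lookup. The names `GLOBAL_CONFIG` and `resolve` are hypothetical stand-ins for whatever the library does internally, and the sketch assumes that an argument left as None means "not set".

```python
# Hypothetical stand-in for the values passed to configure().
GLOBAL_CONFIG = {"budget": "mid", "max_tokens": 4096, "tags": ["env:prod"]}


def resolve(name, override=None, config=GLOBAL_CONFIG):
    """Return the per-call override if one was given, else the global value."""
    return config.get(name) if override is None else override


print(resolve("budget", "high"))  # → high
print(resolve("max_tokens"))      # → 4096
```

Note that in this sketch an override replaces the global value outright: passing tags=["session:abc"] would not merge with ["env:prod"]. Whether the library merges or replaces tags is not stated on this page.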

Memory Instructions Options​

Customize what memories get injected and how:

instructions_fn = memory_instructions(
    client=client,
    bank_id="user-123",
    query="relevant context about the user",  # What to search for
    budget="low",  # Keep it fast
    max_results=5,  # Limit injected memories
    max_tokens=4096,  # Max recall tokens
    prefix="Relevant memories:\n",  # Text before the memory list
    tags=["scope:global"],  # Filter by tags
    tags_match="any",  # Tag match mode
)
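To make the roles of prefix and max_results concrete, here is a minimal sketch of how recalled memories might be turned into an instruction string. The function `format_memories`, the "- " bullet format, and the empty-string result for an empty recall are all assumptions; the library's actual output format is not documented on this page.

```python
def format_memories(memories, prefix="Relevant memories:\n", max_results=5):
    """Build an instruction string from a list of recalled memory texts."""
    selected = memories[:max_results]  # keep at most max_results entries
    if not selected:
        return ""  # Assumption: nothing is injected when recall is empty.
    return prefix + "\n".join(f"- {text}" for text in selected)


# Made-up memory texts, for illustration only.
print(format_memories(["Prefers dark mode", "Uses Python"], max_results=1))
```

A small max_results combined with budget="low" keeps the injected block short, which matters because it is added to the prompt on every run.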

API Reference​

create_hindsight_tools()​

| Parameter | Default | Description |
| --- | --- | --- |
| bank_id | required | Hindsight memory bank ID |
| client | None | Pre-configured Hindsight client |
| hindsight_api_url | None | API URL (used if no client provided) |
| api_key | None | API key (used if no client provided) |
| budget | "mid" | Recall/reflect budget level (low/mid/high) |
| max_tokens | 4096 | Maximum tokens for recall results |
| tags | None | Tags applied when storing memories |
| recall_tags | None | Tags to filter when searching |
| recall_tags_match | "any" | Tag matching mode |
| include_retain | True | Include the retain (store) tool |
| include_recall | True | Include the recall (search) tool |
| include_reflect | True | Include the reflect (synthesize) tool |

memory_instructions()​

| Parameter | Default | Description |
| --- | --- | --- |
| bank_id | required | Hindsight memory bank ID |
| client | None | Pre-configured Hindsight client |
| hindsight_api_url | None | API URL (used if no client provided) |
| api_key | None | API key (used if no client provided) |
| query | "relevant context about the user" | Recall query for memory injection |
| budget | "low" | Recall budget level |
| max_results | 5 | Maximum memories to inject |
| max_tokens | 4096 | Maximum tokens for recall results |
| prefix | "Relevant memories:\n" | Text prepended before memory list |
| tags | None | Tags to filter recall results |
| tags_match | "any" | Tag matching mode |

configure()​

| Parameter | Default | Description |
| --- | --- | --- |
| hindsight_api_url | Hindsight Cloud (https://api.hindsight.vectorize.io) | Hindsight API URL |
| api_key | HINDSIGHT_API_KEY env | API key for authentication |
| budget | "mid" | Default recall budget level |
| max_tokens | 4096 | Default max tokens for recall |
| tags | None | Default tags for retain operations |
| recall_tags | None | Default tags to filter recall |
| recall_tags_match | "any" | Default tag matching mode |
| verbose | False | Enable verbose logging |
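
The configure() table lists the HINDSIGHT_API_KEY environment variable as the fallback for api_key. A minimal sketch of that lookup order, assuming an explicit argument wins over the environment; `resolve_api_key` is a hypothetical helper, not a library function, and the key values are placeholders.

```python
import os


def resolve_api_key(explicit=None, environ=None):
    """Prefer an explicitly passed key, else fall back to HINDSIGHT_API_KEY."""
    environ = os.environ if environ is None else environ
    if explicit is not None:
        return explicit
    # Returns None when the variable is unset, e.g. for a local server
    # started without authentication.
    return environ.get("HINDSIGHT_API_KEY")


print(resolve_api_key(None, {"HINDSIGHT_API_KEY": "placeholder-key"}))
# → placeholder-key
```

Reading the key from the environment keeps it out of source control, which is usually preferable to the hard-coded string shown in the examples.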
