Using a coding agent? Run this to install the Hindsight docs skill:
npx skills add https://github.com/vectorize-io/hindsight --skill hindsight-docs

AutoGen

Persistent long-term memory for AutoGen agents via Hindsight. Provides FunctionTool instances that plug directly into AutoGen's AssistantAgent.

Features

  • Memory Tools: retain, recall, and reflect as AutoGen FunctionTool instances compatible with AssistantAgent(tools=[...])
  • Async-Native: uses aretain, arecall, and areflect directly, so the tools run in AutoGen's async runtime without sync wrappers
  • Selective Tools: include only the tools you need with the include_retain, include_recall, and include_reflect flags
  • Tag-Based Scoping: partition memories by topic, session, or user with tags
  • Global Configuration: configure once with configure(), create tools anywhere

Installation

pip install hindsight-autogen autogen-agentchat "autogen-ext[openai]"

hindsight-autogen pulls in autogen-core and hindsight-client. You also need autogen-agentchat for AssistantAgent and autogen-ext[openai] for the OpenAI model client.

Quick Start

import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
from hindsight_client import Hindsight
from hindsight_autogen import create_hindsight_tools

async def main():
    client = Hindsight(base_url="http://localhost:8888")
    await client.acreate_bank(bank_id="user-123")

    model_client = OpenAIChatCompletionClient(model="gpt-4o")
    tools = create_hindsight_tools(client=client, bank_id="user-123")

    agent = AssistantAgent(
        name="assistant",
        model_client=model_client,
        tools=tools,
    )

    # Store a memory
    result = await agent.run(task="Remember that I prefer dark mode")
    print(result.messages[-1].content)

    # Hindsight processes retained content asynchronously (fact extraction,
    # entity resolution, embeddings). A brief pause ensures memories are
    # searchable before the next recall. In production, this delay is only
    # needed when retain and recall happen back-to-back in the same script.
    await asyncio.sleep(3)

    # Recall it later
    result = await agent.run(task="What are my UI preferences?")
    print(result.messages[-1].content)

    # Clean up
    await client.aclose()
    await model_client.close()

asyncio.run(main())
Jupyter Notebooks

If you're running in a Jupyter notebook, you don't need asyncio.run(). Use await directly in cells, since the notebook already has an active event loop.

The agent gets three tools it can call:

  • hindsight_retain: Store information to long-term memory
  • hindsight_recall: Search long-term memory for relevant facts
  • hindsight_reflect: Synthesize a reasoned answer from memories
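
The fixed asyncio.sleep(3) in the Quick Start is simple but either wastes time or is too short under load. One alternative is to poll until a readiness check passes. The sketch below is a generic helper, not part of hindsight-autogen: wait_until and memory_is_searchable are hypothetical names, and the check here is a stand-in that you would replace with a real query (for example, a recall that returns at least one result).

```python
import asyncio
import time
from typing import Awaitable, Callable


async def wait_until(
    check: Callable[[], Awaitable[bool]],
    timeout: float = 10.0,
    interval: float = 0.5,
) -> bool:
    """Poll `check` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if await check():
            return True
        # Give up if another sleep would carry us past the deadline.
        if time.monotonic() + interval > deadline:
            return False
        await asyncio.sleep(interval)


async def demo() -> bool:
    attempts = 0

    # Hypothetical readiness check; a real one would query the memory bank.
    async def memory_is_searchable() -> bool:
        nonlocal attempts
        attempts += 1
        return attempts >= 3  # pretend indexing finishes on the third poll

    return await wait_until(memory_is_searchable, timeout=2.0, interval=0.01)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A bounded timeout keeps a stalled backend from hanging the script; the caller decides what to do when wait_until returns False.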

Selecting Tools

Include only the tools you need:

tools = create_hindsight_tools(
    client=client,
    bank_id="user-123",
    include_retain=True,
    include_recall=True,
    include_reflect=False,  # Omit reflect
)
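
If the set of enabled tools comes from application settings rather than being hard-coded, the three flags can be derived from a list of names. This is a small sketch: include_flags is a hypothetical helper, not part of hindsight-autogen; only the keyword names include_retain, include_recall, and include_reflect come from the documentation above.

```python
def include_flags(enabled: set[str]) -> dict[str, bool]:
    """Map tool names to the include_* keyword arguments shown above."""
    known = ("retain", "recall", "reflect")
    unknown = enabled - set(known)
    if unknown:
        raise ValueError(f"unknown tool names: {sorted(unknown)}")
    return {f"include_{name}": name in enabled for name in known}


flags = include_flags({"retain", "recall"})
print(flags)

# The result can then be unpacked into the factory call, e.g.:
# tools = create_hindsight_tools(client=client, bank_id="user-123", **flags)
```

Rejecting unknown names early means a typo in configuration does not silently disable a tool.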

Global Configuration

Instead of passing a client to every call, configure once:

from hindsight_autogen import configure, create_hindsight_tools

configure(
    hindsight_api_url="http://localhost:8888",
    api_key="your-api-key",  # Or set HINDSIGHT_API_KEY env var
    budget="mid",  # Recall budget: low/mid/high
    max_tokens=4096,  # Max tokens for recall results
    tags=["env:prod"],  # Tags for stored memories
    recall_tags=["scope:global"],  # Tags to filter recall
    recall_tags_match="any",  # Tag match mode
)

# Now create tools without passing a client; the global config is used
tools = create_hindsight_tools(bank_id="user-123")

Memory Scoping with Tags

Use tags to partition memories by topic, session, or user:

# Store memories tagged by source
tools = create_hindsight_tools(
    client=client,
    bank_id="user-123",
    tags=["source:chat", "session:abc"],
    recall_tags=["source:chat"],
    recall_tags_match="any",
)
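
To reason about which memories a given recall_tags and recall_tags_match combination would return, a small local model can help. The function below is an illustration only, not Hindsight code. It assumes that "any" requires at least one shared tag, that "all" requires every filter tag to be present, and that the non-strict modes also return untagged memories while the _strict variants exclude them. Check these assumptions against the Hindsight recall documentation before relying on them.

```python
def matches(memory_tags: list[str], filter_tags: list[str], mode: str = "any") -> bool:
    """Model of tag filtering under the assumptions stated in the lead-in."""
    if not filter_tags:
        return True  # no filter configured: every memory is eligible

    strict = mode.endswith("_strict")
    base = mode.removesuffix("_strict")
    if base not in ("any", "all"):
        raise ValueError(f"unsupported match mode: {mode!r}")

    have, want = set(memory_tags), set(filter_tags)
    if not have:
        # ASSUMPTION: untagged memories pass only in the non-strict modes.
        return not strict
    if base == "any":
        return bool(have & want)  # at least one tag in common
    return want <= have  # every filter tag is present on the memory


memory = ["source:chat", "session:abc"]
print(matches(memory, ["source:chat"], "any"))
print(matches(["source:chat"], ["source:chat", "session:abc"], "all"))
print(matches([], ["source:chat"], "any_strict"))
```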

Production Patterns

Error Handling

Tools raise HindsightError on failure, which AutoGen surfaces to the agent as a tool error. Wrap agent calls for graceful degradation:

from hindsight_autogen.errors import HindsightError

try:
    result = await agent.run(task="What do you remember about me?")
except HindsightError as e:
    print(f"Memory operation failed: {e}")
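
Graceful degradation usually means more than logging: the application still needs to produce an answer. One possible shape is a wrapper that tries the memory-backed path and falls back to a memory-free one. In this sketch, run_with_fallback is a hypothetical helper, and HindsightError is redefined locally as a stand-in so the example runs without Hindsight installed; in real code you would import it from hindsight_autogen.errors as shown above.

```python
import asyncio
from typing import Awaitable, Callable


class HindsightError(Exception):
    """Local stand-in for hindsight_autogen.errors.HindsightError."""


async def run_with_fallback(
    primary: Callable[[], Awaitable[str]],
    fallback: Callable[[], Awaitable[str]],
) -> str:
    """Run `primary`; if a memory error escapes, log it and run `fallback`."""
    try:
        return await primary()
    except HindsightError as exc:
        print(f"Memory operation failed, continuing without memory: {exc}")
        return await fallback()


async def demo() -> str:
    async def with_memory() -> str:
        # Simulated outage; a real call would be agent.run(task=...).
        raise HindsightError("connection refused")

    async def without_memory() -> str:
        return "I can't access my memory right now, but I can still help."

    return await run_with_fallback(with_memory, without_memory)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Catching only HindsightError keeps unrelated failures (model errors, bugs) visible instead of hiding them behind the fallback.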

Bank Lifecycle

Create banks before first use and clean up when done:

from hindsight_client import Hindsight
from hindsight_autogen import create_hindsight_tools

async def main():
    client = Hindsight(base_url="http://localhost:8888")

    # Create bank (idempotent)
    await client.acreate_bank(bank_id="user-123")

    tools = create_hindsight_tools(client=client, bank_id="user-123")
    # ... use tools ...

    # Optional: delete bank when no longer needed
    await client.adelete_bank(bank_id="user-123")
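
For short-lived banks (tests, one-off jobs), the create and delete steps can be paired in an async context manager so the bank is removed even if the work in between raises. This is a sketch under assumptions: temporary_bank is a hypothetical helper, and FakeClient is a stand-in that only mimics the acreate_bank and adelete_bank calls used on this page, so the example runs without a Hindsight server.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeClient:
    """Stand-in client that records calls instead of contacting a server."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    async def acreate_bank(self, bank_id: str) -> None:
        self.calls.append(("create", bank_id))

    async def adelete_bank(self, bank_id: str) -> None:
        self.calls.append(("delete", bank_id))


@asynccontextmanager
async def temporary_bank(client, bank_id: str):
    """Create a bank on entry and delete it on exit, even after an error."""
    await client.acreate_bank(bank_id=bank_id)
    try:
        yield bank_id
    finally:
        await client.adelete_bank(bank_id=bank_id)


async def demo() -> list[tuple[str, str]]:
    client = FakeClient()
    try:
        async with temporary_bank(client, "test-run-1"):
            raise RuntimeError("simulated failure while using the tools")
    except RuntimeError:
        pass  # the bank has already been deleted by the context manager
    return client.calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

For per-user banks that should persist across sessions, skip the delete step and keep only the idempotent create.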

Multi-Agent Teams

Give each agent its own memory bank, or share a bank across a team:

# Per-agent memory
researcher_tools = create_hindsight_tools(client=client, bank_id="researcher-memory")
writer_tools = create_hindsight_tools(client=client, bank_id="writer-memory")

# Shared team memory
shared_tools = create_hindsight_tools(
    client=client,
    bank_id="team-shared",
    tags=["team:content"],
)

API Reference

create_hindsight_tools()

| Parameter | Default | Description |
| --- | --- | --- |
| bank_id | required | Hindsight memory bank ID |
| client | None | Pre-configured Hindsight client |
| hindsight_api_url | None | API URL (used if no client provided) |
| api_key | None | API key (used if no client provided) |
| budget | "mid" | Recall/reflect budget level (low/mid/high) |
| max_tokens | 4096 | Maximum tokens for recall results |
| tags | None | Tags applied when storing memories |
| recall_tags | None | Tags to filter when searching |
| recall_tags_match | "any" | Tag matching mode (any/all/any_strict/all_strict) |
| retain_metadata | None | Default metadata dict for retain operations |
| retain_document_id | None | Default document_id for retain (groups/upserts memories) |
| recall_types | None | Fact types to filter (world, experience, observation) |
| recall_include_entities | False | Include entity information in recall results |
| reflect_context | None | Additional context for reflect operations |
| reflect_max_tokens | None | Max tokens for reflect results (defaults to max_tokens) |
| reflect_response_schema | None | JSON schema to constrain reflect output format |
| reflect_tags | None | Tags to filter memories used in reflect (defaults to recall_tags) |
| reflect_tags_match | None | Tag matching for reflect (defaults to recall_tags_match) |
| include_retain | True | Include the retain (store) tool |
| include_recall | True | Include the recall (search) tool |
| include_reflect | True | Include the reflect (synthesize) tool |

configure()

| Parameter | Default | Description |
| --- | --- | --- |
| hindsight_api_url | Production API | Hindsight API URL |
| api_key | HINDSIGHT_API_KEY env | API key for authentication |
| budget | "mid" | Default recall budget level |
| max_tokens | 4096 | Default max tokens for recall |
| tags | None | Default tags for retain operations |
| recall_tags | None | Default tags to filter recall |
| recall_tags_match | "any" | Default tag matching mode |

Requirements

  • Python >= 3.10
  • autogen-core >= 0.4.0
  • hindsight-client >= 0.4.0
