Using a coding agent? Run this to install the Hindsight docs skill:
npx skills add https://github.com/vectorize-io/hindsight --skill hindsight-docs

MCP Server

Hindsight includes a built-in Model Context Protocol (MCP) server that allows AI assistants to store and retrieve memories directly.

Access​

The MCP server is enabled by default and mounted at /mcp on the API server. Each memory bank has its own MCP endpoint:

http://localhost:8888/mcp/{bank_id}/

For example, to connect to the memory bank alice:

http://localhost:8888/mcp/alice/

To disable the MCP server, set the environment variable:

export HINDSIGHT_API_MCP_ENABLED=false

Authentication​

By default, the MCP endpoint is open (no authentication required).

To enable authentication, configure the API key tenant extension:

export HINDSIGHT_API_TENANT_EXTENSION=hindsight_api.extensions.builtin.tenant:ApiKeyTenantExtension
export HINDSIGHT_API_TENANT_API_KEY=your-secret-key

When authentication is enabled, include your API key in the Authorization header:

Claude Code​

claude mcp add --transport http hindsight http://localhost:8888/mcp \
  --header "Authorization: Bearer your-secret-key" \
  --header "X-Bank-Id: my-bank"

Claude Desktop​

Add to ~/.claude_desktop_config.json:

{
  "mcpServers": {
    "hindsight": {
      "url": "http://localhost:8888/mcp",
      "headers": {
        "Authorization": "Bearer your-secret-key",
        "X-Bank-Id": "my-bank"
      }
    }
  }
}

Direct HTTP Request​

curl -X POST http://localhost:8888/mcp \
  -H "Authorization: Bearer your-secret-key" \
  -H "X-Bank-Id: my-bank" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc": "2.0", "method": "tools/list", "id": 1}'

If the key is missing or invalid, requests will receive a 401 Unauthorized response.
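
The same tools/list call can also be prepared from Python. This is a minimal sketch using only the standard library; the helper name build_tools_list_request is illustrative, and the URL, key, and bank values are the placeholders from the curl example.

```python
import json
import urllib.request


def build_tools_list_request(base_url, api_key, bank_id):
    """Build (but do not send) a JSON-RPC tools/list request for the MCP endpoint."""
    body = json.dumps({"jsonrpc": "2.0", "method": "tools/list", "id": 1}).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Bank-Id": bank_id,
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
    )


request = build_tools_list_request("http://localhost:8888/mcp", "your-secret-key", "my-bank")
# To send it against a running server:
#     with urllib.request.urlopen(request) as response:
#         print(response.status, response.read().decode("utf-8"))
# urlopen raises urllib.error.HTTPError for a 401 response.
```

Building the request separately from sending it keeps the headers easy to inspect before any network call is made.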

Bank Selection​

The memory bank is resolved in this priority order:

  1. URL path (highest priority): http://localhost:8888/mcp/my-bank/
  2. X-Bank-Id header: --header "X-Bank-Id: my-bank"
  3. Default: Uses HINDSIGHT_MCP_BANK_ID env var (default: "default")
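
The priority order can be illustrated with a short sketch. The function resolve_bank_id below is hypothetical and only mirrors the documented order; it is not Hindsight's server code.

```python
import os
from urllib.parse import urlparse


def resolve_bank_id(url, headers, env=None):
    """Illustrate the documented priority: URL path, then X-Bank-Id header, then env default."""
    env = os.environ if env is None else env
    # Path segments after the /mcp mount point, e.g. /mcp/my-bank/ -> ["mcp", "my-bank"]
    segments = [part for part in urlparse(url).path.split("/") if part]
    if len(segments) >= 2 and segments[0] == "mcp":
        return segments[1]
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("x-bank-id"):
        return lowered["x-bank-id"]
    return env.get("HINDSIGHT_MCP_BANK_ID", "default")


print(resolve_bank_id("http://localhost:8888/mcp/alice/", {"X-Bank-Id": "bob"}, {}))  # alice
print(resolve_bank_id("http://localhost:8888/mcp", {"X-Bank-Id": "bob"}, {}))         # bob
print(resolve_bank_id("http://localhost:8888/mcp", {}, {}))                           # default
```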

Per-Bank Endpoints​

Unlike traditional MCP servers where tools require explicit identifiers, Hindsight uses per-bank endpoints. The bank_id is part of the URL path, so tools don't need to specify which bank to use; it is implicit from the connection.

This design:

  • Simplifies tool usage: no need to pass bank_id with every call
  • Enforces isolation: each MCP connection is scoped to a single bank
  • Enables multi-tenant setups: connect different users to different endpoints

Two Modes​

The MCP server operates in two modes depending on the URL:

| Mode | URL | Tools | bank_id |
| --- | --- | --- | --- |
| Single-bank | /mcp/{bank_id}/ | 27 tools (memory, mental models, directives, documents, operations, tags, bank management) | Implicit from URL |
| Multi-bank | /mcp/ | All 30 tools including list_banks, create_bank, get_bank_stats | Explicit bank_id parameter on each tool |

Single-bank mode (recommended) scopes all operations to the bank in the URL. Tools don't expose a bank_id parameter.

Multi-bank mode exposes all tools with an optional bank_id parameter, plus bank management tools (list_banks, create_bank, get_bank_stats).
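
To make the difference concrete, here is a sketch of the MCP tools/call message a client might send in each mode. The helper build_tool_call is hypothetical; only the bank_id argument and the tool names come from this page.

```python
import json


def build_tool_call(name, arguments, bank_id=None, request_id=1):
    """Build an MCP tools/call JSON-RPC message.

    Pass bank_id only for the multi-bank endpoint (/mcp/); on a single-bank
    endpoint (/mcp/{bank_id}/) the bank comes from the URL instead.
    """
    args = dict(arguments)
    if bank_id is not None:
        args["bank_id"] = bank_id
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": args},
    }


single = build_tool_call("recall", {"query": "programming preferences"})
multi = build_tool_call("recall", {"query": "programming preferences"}, bank_id="alice")
print(json.dumps(multi, indent=2))
```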

Tool Metadata and Instructions​

Hindsight can append deployment-specific guidance to the retain and recall MCP tool descriptions. Set HINDSIGHT_API_MCP_INSTRUCTIONS on the API server when clients should see local rules, such as which tags to use or which memories should be retained.

export HINDSIGHT_API_MCP_INSTRUCTIONS="Use project:<name> tags for project-specific memories."

MCP clients that read tool annotations also receive safety hints from the built-in tools:

  • Read-only operations such as recall, reflect, list_*, and get_* are marked with readOnlyHint: true.
  • Delete, clear, and invalidate operations are marked with destructiveHint: true.
  • openWorldHint is false for the built-in tools because Hindsight operates on its configured memory store rather than the open internet.
  • Write operations such as retain, create_*, update_*, refresh_mental_model, and cancel_operation are not marked destructive.
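
A client could use these hints to decide which calls need user confirmation. The sketch below assumes each tool entry carries an annotations object, as in an MCP tools/list result; the sample data is hand-written for illustration and is not captured server output.

```python
def summarize_tool_safety(tools):
    """Group tool names by the MCP annotation hints they carry.

    Only explicit true values count as set in this sketch.
    """
    summary = {"read_only": [], "destructive": [], "other_writes": []}
    for tool in tools:
        hints = tool.get("annotations", {})
        if hints.get("readOnlyHint"):
            summary["read_only"].append(tool["name"])
        elif hints.get("destructiveHint"):
            summary["destructive"].append(tool["name"])
        else:
            summary["other_writes"].append(tool["name"])
    return summary


# Hand-written sample shaped like a tools/list result.
sample_tools = [
    {"name": "recall", "annotations": {"readOnlyHint": True, "openWorldHint": False}},
    {"name": "delete_document", "annotations": {"destructiveHint": True, "openWorldHint": False}},
    {"name": "retain", "annotations": {"readOnlyHint": False, "destructiveHint": False, "openWorldHint": False}},
]
print(summarize_tool_safety(sample_tools))
```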

Available Tools​

retain​

Store information to long-term memory.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| content | string | Yes | The fact or memory to store |
| context | string | No | Category for the memory (default: general) |
| timestamp | string | No | ISO 8601 timestamp for when the event occurred |
| tags | list[string] | No | Tags for organizing and filtering this memory |
| metadata | object | No | Key-value metadata to attach (e.g., {"source": "slack"}) |
| document_id | string | No | Associate this memory with an existing document |

Example:

{
  "name": "retain",
  "arguments": {
    "content": "User prefers Python over JavaScript for backend development",
    "context": "programming_preferences",
    "tags": ["user:alice", "preferences"]
  }
}

When to use:

  • User shares personal facts, preferences, or interests
  • Important events or milestones are mentioned
  • Decisions, opinions, or goals are stated
  • Work context or project details are discussed

sync_retain​

Store information to long-term memory and wait for completion. Unlike retain (which is asynchronous), sync_retain blocks until the memory is fully stored and immediately available for recall. This is useful for read-after-write flows where you query right after storing.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| content | string | Yes | The fact or memory to store |
| context | string | No | Category for the memory (default: general) |
| timestamp | string | No | ISO 8601 timestamp for when the event occurred |
| tags | list[string] | No | Tags for organizing and filtering this memory |
| metadata | object | No | Key-value metadata to attach (e.g., {"source": "slack"}) |
| document_id | string | No | Associate this memory with an existing document |

Example:

{
  "name": "sync_retain",
  "arguments": {
    "content": "User prefers Python over JavaScript for backend development",
    "context": "programming_preferences",
    "tags": ["user:alice", "preferences"]
  }
}

When to use:

  • You need the memory queryable immediately after storing (read-after-write)
  • A workflow step depends on the stored memory being available before continuing
  • Otherwise prefer retain (asynchronous) to avoid blocking on storage
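
A read-after-write flow might look like the sketch below. The call_tool argument is a caller-supplied placeholder for whatever sends an MCP tools/call request; it is not a Hindsight API, and a recording stand-in is used here so the example runs without a server.

```python
def store_then_recall(call_tool, content, query):
    """Store a memory with sync_retain, then recall it in the next step.

    call_tool is a caller-supplied function (name, arguments) -> result.
    """
    call_tool("sync_retain", {"content": content})
    return call_tool("recall", {"query": query})


# Stand-in transport that records calls instead of contacting a server.
calls = []


def fake_call_tool(name, arguments):
    calls.append((name, arguments))
    return {"tool": name}


store_then_recall(fake_call_tool, "User prefers Python", "language preferences")
print([name for name, _ in calls])  # ['sync_retain', 'recall']
```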

recall​

Search memories to provide personalized responses.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Natural language search query |
| max_tokens | integer | No | Maximum tokens to return (default: 4096) |
| budget | string | No | Search thoroughness: low, mid, or high (default: high) |
| types | list[string] | No | Filter by fact type: world, experience, observation. Defaults to all |
| tags | list[string] | No | Filter memories by tags |
| tags_match | string | No | Tag matching mode: any (default) or all |
| query_timestamp | string | No | ISO 8601 timestamp; recall as if asking at this point in time. Anchors relative temporal expressions and recency scoring |
| min_scores | object | No | Optional per-stage score floors, e.g. {"reranker": 0.5}. Keys: semantic/keyword (retrieval-level cutoffs), reranker/final (post-ranking). All inclusive and AND-ed; omit for no filtering. Reranker scores aren't calibrated across queries, so calibrate before use |

Example:

{
  "name": "recall",
  "arguments": {
    "query": "What are the user's programming language preferences?",
    "tags": ["preferences"],
    "budget": "high"
  }
}

When to use:

  • Start of conversation to recall relevant context
  • Before making recommendations
  • When user asks about something they may have mentioned before
  • To provide continuity across conversations
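
The "inclusive and AND-ed" behavior of min_scores can be illustrated locally. This sketch is not Hindsight's filtering code: the candidate scores are made up, and treating a missing stage score as failing the floor is an assumption.

```python
def passes_min_scores(stage_scores, min_scores):
    """Return True if every configured floor is met (inclusive, AND-ed).

    Assumption: a candidate with no score for a configured stage fails that floor.
    """
    return all(
        stage_scores.get(stage, float("-inf")) >= floor
        for stage, floor in min_scores.items()
    )


# Hypothetical per-stage scores for three candidate memories.
candidates = {
    "m1": {"semantic": 0.82, "reranker": 0.50},
    "m2": {"semantic": 0.90, "reranker": 0.49},
    "m3": {"semantic": 0.40, "reranker": 0.75},
}
floors = {"semantic": 0.5, "reranker": 0.5}
kept = [mid for mid, scores in candidates.items() if passes_min_scores(scores, floors)]
print(kept)  # ['m1']
```

Only m1 survives: its reranker score sits exactly on the floor (inclusive), while m2 and m3 each miss one of the two floors. An empty min_scores object keeps every candidate.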

reflect​

Generate thoughtful analysis by synthesizing stored memories with the bank's personality.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | The question or topic to reflect on |
| context | string | No | Optional context about why this reflection is needed |
| budget | string | No | Search budget: low, mid, or high (default: low) |
| max_tokens | integer | No | Maximum tokens in the response (default: 4096) |
| response_schema | object | No | JSON Schema for structured output. When provided, the response includes a structured_output field |
| tags | list[string] | No | Filter memories by tags before reflecting |
| tags_match | string | No | Tag matching mode: any (default) or all |
| include_trace | boolean | No | Include tool_trace and llm_trace debugging output. Defaults to false to keep responses small |

Example:

{
  "name": "reflect",
  "arguments": {
    "query": "Based on my past decisions, what architectural style do I prefer?",
    "budget": "mid",
    "tags": ["architecture"]
  }
}

When to use:

  • When reasoned analysis is needed, not just fact retrieval
  • Questions like "What should I do?" rather than "What did I say?"
  • Synthesizing patterns across multiple memories
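
For structured output, a reflect call can carry a JSON Schema in response_schema. The sketch below builds such a call; the schema fields preferred_style and evidence are hypothetical names chosen for illustration.

```python
import json

# JSON Schema describing the structured answer we want back.
# The property names are illustrative, not defined by Hindsight.
response_schema = {
    "type": "object",
    "properties": {
        "preferred_style": {"type": "string"},
        "evidence": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["preferred_style"],
}

reflect_call = {
    "name": "reflect",
    "arguments": {
        "query": "Based on my past decisions, what architectural style do I prefer?",
        "budget": "mid",
        "tags": ["architecture"],
        "response_schema": response_schema,
    },
}
print(json.dumps(reflect_call, indent=2))
```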

create_mental_model​

Create a mental model: a living document that stays current with your memories. Mental models are pre-computed reflections that get automatically refreshed as new memories are stored.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Human-readable name for the mental model |
| source_query | string | Yes | The query used to generate and refresh the model |
| mental_model_id | string | No | Custom ID (alphanumeric lowercase with hyphens). Auto-generated if not provided |
| tags | list[string] | No | Tags for organizing and filtering models |
| max_tokens | integer | No | Maximum tokens for model content (default: 2048) |
| trigger_refresh_after_consolidation | boolean | No | Auto-refresh this model after memory consolidation (default: false) |

Example:

{
  "name": "create_mental_model",
  "arguments": {
    "name": "Team Directory",
    "source_query": "Who works here and what do they do?",
    "tags": ["team", "people"]
  }
}

Content generation runs asynchronously. The response includes an operation_id to track progress.
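
One way to track that operation_id is to poll get_operation until it reaches a terminal status. This is a sketch under assumptions: call_tool is a caller-supplied placeholder for sending a tools/call request, and the result is assumed to be a dict with a "status" key; the real response shape may differ.

```python
import time


def wait_for_operation(call_tool, operation_id, poll_seconds=1.0, max_polls=30):
    """Poll get_operation until the operation reaches a terminal status.

    Assumes call_tool(name, arguments) returns a dict with a "status" key.
    """
    terminal = {"completed", "failed", "cancelled"}
    for _ in range(max_polls):
        result = call_tool("get_operation", {"operation_id": operation_id})
        if result["status"] in terminal:
            return result["status"]
        time.sleep(poll_seconds)
    raise TimeoutError(f"operation {operation_id} still running after {max_polls} polls")


# Stand-in transport that reports two in-progress polls, then completion.
statuses = iter(["pending", "running", "completed"])


def fake_call_tool(name, arguments):
    return {"status": next(statuses)}


print(wait_for_operation(fake_call_tool, "op-123", poll_seconds=0))  # completed
```

The max_polls cap keeps a stuck operation from blocking the caller indefinitely.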


list_mental_models​

List all mental models in a bank, optionally filtered by tags.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| tags | list[string] | No | Filter models by tags |

get_mental_model​

Retrieve a specific mental model by ID, including its full content.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| mental_model_id | string | Yes | The ID of the mental model to retrieve |

update_mental_model​

Update a mental model's metadata or settings.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| mental_model_id | string | Yes | The ID of the mental model to update |
| name | string | No | New name |
| source_query | string | No | New source query |
| tags | list[string] | No | New tags |
| max_tokens | integer | No | New max tokens |
| trigger_refresh_after_consolidation | boolean | No | Auto-refresh after consolidation. Only set when you want to change this setting |

delete_mental_model​

Permanently delete a mental model.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| mental_model_id | string | Yes | The ID of the mental model to delete |

refresh_mental_model​

Re-generate a mental model's content from the latest memories. Runs asynchronously.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| mental_model_id | string | Yes | The ID of the mental model to refresh |

clear_mental_model​

Clear a mental model's content while keeping its definition. After clearing, call refresh_mental_model to rebuild it from the latest memories.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| mental_model_id | string | Yes | The ID of the mental model to clear |

list_banks (multi-bank mode only)​

List all available memory banks.


create_bank (multi-bank mode only)​

Create a new memory bank or retrieve an existing one.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| bank_id | string | Yes | The ID for the new bank |
| name | string | No | Human-friendly name for the bank |
| mission | string | No | Mission describing who the agent is and what they're trying to accomplish |

list_directives​

List all directives in a bank. Directives are instructions that guide how the memory system processes and responds to queries.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| tags | list[string] | No | Filter directives by tags |
| active_only | boolean | No | Only return active directives (default: true) |

create_directive​

Create a new directive in a bank.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Human-readable name for the directive |
| content | string | Yes | The directive content/instruction |
| priority | integer | No | Priority level (higher = more important) |
| is_active | boolean | No | Whether the directive is active (default: true) |
| tags | list[string] | No | Tags for organizing directives |

delete_directive​

Delete a directive by ID.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| directive_id | string | Yes | The ID of the directive to delete |

list_memories​

Browse stored memories with optional filtering and pagination.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | No | Filter by fact type: world, experience, or observation |
| q | string | No | Search query to filter memories |
| limit | integer | No | Maximum number of results (default: 100) |
| offset | integer | No | Number of results to skip for pagination (default: 0) |
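
Paging through a large bank with limit and offset could be sketched as follows. The call_tool argument is a caller-supplied placeholder, and the sketch assumes list_memories returns a plain list of items; the real response may wrap the items in an object.

```python
def iter_memories(call_tool, page_size=100, **filters):
    """Yield memories page by page using limit/offset.

    Assumes call_tool("list_memories", args) returns a list of items.
    """
    offset = 0
    while True:
        page = call_tool("list_memories", {**filters, "limit": page_size, "offset": offset})
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size


# Stand-in transport serving 5 fake memories, so the sketch runs offline.
fake_store = [f"memory-{i}" for i in range(5)]


def fake_call_tool(name, arguments):
    start = arguments["offset"]
    return fake_store[start:start + arguments["limit"]]


print(list(iter_memories(fake_call_tool, page_size=2)))
```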

get_memory​

Retrieve a specific memory by ID.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| memory_id | string | Yes | The ID of the memory to retrieve |

list_documents​

List documents that have been ingested into the memory bank.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| q | string | No | Search query to filter documents |
| limit | integer | No | Maximum number of results (default: 100) |

get_document​

Retrieve a specific document by ID, including its metadata.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| document_id | string | Yes | The ID of the document to retrieve |

delete_document​

Delete a document and all memories linked to it.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| document_id | string | Yes | The ID of the document to delete |

list_operations​

List async operations (retain processing, mental model refresh, etc.) with optional status filtering.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| status | string | No | Filter by status: pending, running, completed, failed, cancelled |
| limit | integer | No | Maximum number of results (default: 100) |

get_operation​

Get the status and details of an async operation.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| operation_id | string | Yes | The ID of the operation to check |

cancel_operation​

Cancel a pending or running async operation.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| operation_id | string | Yes | The ID of the operation to cancel |

list_tags​

List all unique tags used in a bank, optionally filtered by pattern.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| q | string | No | Glob pattern to filter tags (e.g., project:*) |
| limit | integer | No | Maximum number of results (default: 100) |
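
To get a feel for what a glob pattern such as project:* selects, the sketch below applies Python's fnmatch rules to a made-up tag list. Server-side matching may differ in detail from fnmatch, so treat this as an approximation.

```python
from fnmatch import fnmatchcase


def filter_tags(tags, pattern):
    """Return the tags matching a shell-style glob pattern (case-sensitive)."""
    return [tag for tag in tags if fnmatchcase(tag, pattern)]


# Made-up tags for illustration.
tags = ["project:hindsight", "project:docs", "user:alice", "preferences"]
print(filter_tags(tags, "project:*"))  # ['project:hindsight', 'project:docs']
```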

get_bank​

Get information about a memory bank, including its name, mission, and disposition.


get_bank_stats (multi-bank mode only)​

Get statistics for a memory bank (node/link counts).


update_bank​

Update a memory bank's name and/or any bank-level configuration fields. Only the fields you provide are updated; omitted fields remain unchanged.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | No | Human-friendly display name for the bank |
| mission | string | No | Deprecated; alias for config_updates.reflect_mission |
| config_updates | object | No | Dictionary of configuration fields to update. Supports all bank-configurable fields (see below). Non-configurable or credential fields are rejected |

The config_updates object accepts any bank-configurable field by its Python field name, including:

  • reflect_mission: mission/context for Reflect operations
  • retain_mission: steers what gets extracted during retain()
  • retain_extraction_mode: concise (default), verbose, or custom
  • retain_custom_instructions: custom extraction prompt (active when mode is custom)
  • retain_chunk_size: target maximum characters for each content chunk
  • retain_structured_chunk_size: maximum characters for a single JSONL line or conversation turn to keep whole
  • retain_chunk_batch_size: number of chunks to process in parallel
  • enable_observations: toggle observation consolidation after retain()
  • observations_mission: controls observation synthesis rules
  • disposition_skepticism: critical evaluation level (1–5)
  • disposition_literalism: literal vs. abstract interpretation (1–5)
  • disposition_empathy: emotional context consideration (1–5)
  • entity_labels: controlled vocabulary for entity classification
  • entities_allow_free_form: allow labels outside entity_labels
  • recall_include_chunks: include raw chunks in recall results
  • recall_max_tokens: max tokens for recall results
  • mcp_enabled_tools: tool allowlist for this bank
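
A partial update might be assembled as in the sketch below. The helper build_update_bank_call is hypothetical, and its client-side range check on the disposition_* fields is an added convenience based on the 1 to 5 scales listed above; the server may validate differently.

```python
def build_update_bank_call(name=None, **config_updates):
    """Build an update_bank tool call containing only the fields being changed."""
    for field, value in config_updates.items():
        # The three disposition_* fields are documented as 1 to 5 scales.
        if field.startswith("disposition_") and value not in range(1, 6):
            raise ValueError(f"{field} must be an integer from 1 to 5, got {value!r}")
    arguments = {}
    if name is not None:
        arguments["name"] = name
    if config_updates:
        arguments["config_updates"] = config_updates
    return {"name": "update_bank", "arguments": arguments}


call = build_update_bank_call(retain_extraction_mode="verbose", disposition_skepticism=4)
print(call)
```

Because omitted fields remain unchanged, the call carries only the two settings being modified and no name.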

delete_bank​

Permanently delete a memory bank and all its data (memories, documents, entities, mental models).


clear_memories​

Clear all memories from a bank without deleting the bank itself. Optionally filter by fact type to only clear specific kinds of memories.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | No | Fact type to clear: world, experience, or observation. If not specified, clears all |

Integration with AI Assistants​

The MCP server can be used with any MCP-compatible AI assistant. See the Authentication section above for Claude Code and Claude Desktop configuration examples.

Each user can have their own configuration pointing to their personal memory bank using either:

  • A bank-specific URL path like /mcp/alice/ (recommended)
  • The X-Bank-Id header
