Per-tool TTL: dedup.check() with a ttl_seconds argument that allows the same call after N seconds. Useful for tools that return changing data where a repeated call after 60 seconds is legitimate but a repeated call in the same 5-second window is a loop.
Soft mode: instead of raising DuplicateToolCall, return a bool so callers can decide whether to skip silently or surface the duplicate to the model. Some agent architectures prefer returning the cached result with no exception path.
Stats: dedup.stats() returning total calls, duplicate count, and per-tool breakdown. Useful for identifying which tools trigger the most cycles.
Built as part of the agent-stack family: composable Python primitives for production LLM agents.