Releases: simonw/llm

0.32a3

09 Jun 22:27

@simonw simonw

1e72f0a

0.32a3 Pre-release

Driven by the needs of Datasette Agent's human-in-the-loop ask_user() feature, this release makes the following improvements to how tool calls work:

Tool implementations can declare a parameter named llm_tool_call in order to be passed the llm.ToolCall object for the current invocation. This allows them to access the current llm_tool_call.tool_call_id. See Accessing the tool call from inside a tool. #1480
Every tool call is now guaranteed a unique tool_call_id - providers that do not supply one get a synthesized tc_-prefixed ULID. #1481
Tools can raise a llm.PauseChain exception to cleanly pause the tool chain, useful for things like waiting for human approval. The exception propagates to the caller with .tool_call and .tool_results (completed sibling results) attached, and no model call is made with a placeholder result. See Pausing a chain from inside a tool. #1482
Failure semantics for concurrent tool execution: async sibling tool calls always run to completion before a pause or hook exception propagates. #1482
Chains can now resume from a messages= history ending in unresolved tool calls: the calls are executed through the normal before_call/after_call machinery before the first model call, skipping any that already have results. The execute_tool_calls() method also accepts a new optional tool_calls_list= argument for executing an explicit list of ToolCall objects in place of the calls requested by the response. See Resuming a chain with pending tool calls. #1482
Fixed a bug where the async tool executor silently dropped calls to tools not present in tools= - these now return Error: tool "..." does not exist results, matching the sync executor. #1483
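
The pause and failure semantics described above can be sketched without the library itself. This is a minimal illustration, not the implementation in llm: PauseRequested and Call are hypothetical stand-ins for llm.PauseChain and llm.ToolCall, the TOOLS registry and run_calls() helper are invented for the example, and storing tool_results as a dict keyed by tool_call_id is an assumption.

```python
import asyncio
from dataclasses import dataclass, field


class PauseRequested(Exception):
    """Hypothetical stand-in for llm.PauseChain (not the real class)."""

    def __init__(self, message):
        super().__init__(message)
        self.tool_call = None    # the call that asked to pause
        self.tool_results = {}   # results of sibling calls that completed


@dataclass
class Call:
    """Hypothetical stand-in for llm.ToolCall."""
    tool_call_id: str
    name: str
    arguments: dict = field(default_factory=dict)


async def lookup(call):
    await asyncio.sleep(0.01)  # simulate work that finishes after the pause is raised
    return f"looked up {call.arguments['q']}"


async def ask_user(call):
    raise PauseRequested("waiting for human approval")


TOOLS = {"lookup": lookup, "ask_user": ask_user}


async def run_calls(calls):
    async def run_one(call):
        tool = TOOLS.get(call.name)
        if tool is None:
            # Unknown tools produce an error result instead of being dropped.
            return f'Error: tool "{call.name}" does not exist'
        return await tool(call)

    # return_exceptions=True lets every sibling finish before anything is raised.
    outcomes = await asyncio.gather(
        *(run_one(call) for call in calls), return_exceptions=True
    )
    results, pause = {}, None
    for call, outcome in zip(calls, outcomes):
        if isinstance(outcome, PauseRequested):
            if pause is None:
                outcome.tool_call = call
                pause = outcome
        elif isinstance(outcome, BaseException):
            raise outcome
        else:
            results[call.tool_call_id] = outcome
    if pause is not None:
        pause.tool_results = results
        raise pause
    return results


calls = [
    Call("tc_1", "ask_user"),
    Call("tc_2", "lookup", {"q": "x"}),
    Call("tc_3", "missing"),
]
try:
    asyncio.run(run_calls(calls))
except PauseRequested as pause:
    print(pause.tool_call.tool_call_id, sorted(pause.tool_results))
```

The caller that catches the pause has the paused call plus every completed sibling result, which is what it needs to resume later without repeating finished work.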

0.32a2

12 May 17:45

@simonw simonw

0.32a2

8aba606

0.32a2 Pre-release

Support for the OpenAI Responses API

Most reasoning-capable OpenAI models now use the /v1/responses endpoint instead of /v1/chat/completions. This enables interleaved reasoning across tool calls for GPT-5 class models. #1435

New Responses and AsyncResponses model classes driving the OpenAI Responses API. The existing Chat and AsyncChat classes are unchanged so other plugins that import them keep working.
The following models now use the Responses API by default: o1, o3-mini, o3, o4-mini, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1, gpt-5.2, gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5.5 (and their pinned date variants).
Use -o chat_completions 1 to fall back to the older /v1/chat/completions code path for any of these models.
Encrypted reasoning items are captured as provider_metadata on ReasoningPart objects and round-tripped back to OpenAI on subsequent turns.
Reasoning summaries are now requested with "summary": "auto" so visible reasoning text is streamed back where the model produces it, unless --hide-reasoning or hide_reasoning= is set.
This means that OpenAI prompts run with llm prompt that return reasoning tokens will display that reasoning on standard error.
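
The routing rule described above can be pictured as a small lookup. This is only a sketch of the behavior as the notes state it: endpoint_for() and RESPONSES_API_MODELS are hypothetical names, not part of llm, and the pinned date variants are not enumerated.

```python
# Model IDs copied from the list above; pinned date variants are omitted.
RESPONSES_API_MODELS = {
    "o1", "o3-mini", "o3", "o4-mini",
    "gpt-5", "gpt-5-mini", "gpt-5-nano",
    "gpt-5.1", "gpt-5.2",
    "gpt-5.4", "gpt-5.4-mini", "gpt-5.4-nano",
    "gpt-5.5",
}


def endpoint_for(model_id: str, chat_completions: bool = False) -> str:
    """Hypothetical helper: choose the OpenAI endpoint path for a model.

    chat_completions mirrors the -o chat_completions 1 option.
    """
    if model_id in RESPONSES_API_MODELS and not chat_completions:
        return "/v1/responses"
    return "/v1/chat/completions"


print(endpoint_for("gpt-5"))
print(endpoint_for("gpt-5", chat_completions=True))
```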

CLI

New llm -m model --options flag to list the options supported by a given model. #1441
The -R/--no-reasoning option has been renamed to -R/--hide-reasoning.

Python API

New hide_reasoning=True keyword argument on model.prompt(), conversation.prompt(), model.chain(), conversation.chain(), and their async counterparts, exposed to model plugins as prompt.hide_reasoning. Model plugins can use this to decide if they should request visible reasoning summaries from their providers. #1442
New options= dict keyword argument on Model.prompt(), Conversation.prompt(), Response.reply(), and their async equivalents, matching the pattern already used by .chain(). The previous **kwargs form continues to work for backwards compatibility but is no longer documented, and will be removed in the future. #1432
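
One way to picture the options= change is a function that accepts both the new dict and the legacy keyword form. The prompt() function below is a hypothetical stand-in, not the library's method, and letting options= win when both forms supply the same key is an assumption made for the sketch.

```python
def prompt(text, *, options=None, hide_reasoning=False, **kwargs):
    """Hypothetical stand-in for a prompt() method; returns the resolved settings."""
    merged = dict(kwargs)         # legacy form: prompt("hi", temperature=0.2)
    merged.update(options or {})  # assumption: the explicit options= dict wins on conflict
    return {"prompt": text, "options": merged, "hide_reasoning": hide_reasoning}


new_style = prompt("hi", options={"temperature": 0.2}, hide_reasoning=True)
old_style = prompt("hi", temperature=0.2, hide_reasoning=True)
print(new_style == old_style)
```

Keeping model options in their own dict means they cannot collide with method-level keywords such as hide_reasoning.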

Bug fixes

add_tool_call() calls that were not also recorded as stream events are now correctly emitted as ToolCallPart objects when assembling response parts, so they survive serialization via response.to_dict(). #1433

0.32a1

29 Apr 23:52

@simonw simonw

0.32a1

9a5c24e

0.32a1 Pre-release

Fixed a bug in 0.32a0 where tool-calling conversations were not correctly reinflated from SQLite. #1426

0.32a0

29 Apr 18:57

@simonw simonw

0.32a0

35c35da

0.32a0 Pre-release

This alpha introduces a major backwards-compatible refactor. Models can now be prompted with a list of messages, OpenAI Chat Completions style, and the response can now be iterated over as a sequence of mixed types of content, for example reasoning tokens mixed with text tokens mixed with tool calls.

For more background on this release take a look at the annotated release notes on my blog.

Prompt inputs and response outputs are now expressed as a list of Message objects, each containing typed Part objects (text, reasoning, tool calls, tool results, attachments).
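
As a rough picture of that shape, the sketch below models messages and typed parts with plain dataclasses. Message and Part here are hypothetical stand-ins rather than the real llm classes, and the role names and field names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """Hypothetical stand-in for a typed part of a message."""
    type: str        # e.g. "text", "reasoning", "tool_call", "tool_result"
    content: object


@dataclass
class Message:
    """Hypothetical stand-in for llm.Message."""
    role: str        # assumed roles: "system", "user", "assistant", "tool"
    parts: list = field(default_factory=list)


conversation = [
    Message("user", [Part("text", "What is 2 + 2?")]),
    Message("assistant", [
        Part("reasoning", "Simple arithmetic, but use the tool."),
        Part("tool_call", {"name": "add", "arguments": {"a": 2, "b": 2}}),
    ]),
    Message("tool", [Part("tool_result", "4")]),
    Message("assistant", [Part("text", "2 + 2 = 4")]),
]


def visible_text(messages):
    """Concatenate only the text parts produced by the assistant."""
    return "".join(
        part.content
        for message in messages
        if message.role == "assistant"
        for part in message.parts
        if part.type == "text"
    )


print(visible_text(conversation))
```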

The llm CLI tool can now display reasoning tokens while executing a prompt.

Plugin authors should read the expanded Advanced model plugins documentation, which now covers StreamEvent, consuming prompt.messages, and round-tripping opaque provider metadata such as Anthropic extended-thinking signatures and Gemini thoughtSignature values.

Structured messages and streaming events

New llm.Message value type and constructor helpers llm.user(), llm.assistant(), llm.system(), and llm.tool_message() for building structured prompt inputs. The helpers accept strings, Attachment instances, or nested Part lists.
New messages= keyword argument on model.prompt(), conversation.prompt(), model.chain(), conversation.chain(), and their async counterparts. The prompt=, system=, attachments=, and tool_results= keywords still work and synthesize into the same Message list internally.
New response.stream_events() and response.astream_events() methods yielding typed StreamEvent objects (type is one of "text", "reasoning", "tool_call_name", "tool_call_args", "tool_result", plus a redacted=True marker for opaque reasoning). Iterating over response directly continues to yield only text strings.
New response.messages() method (async: await response.messages()) returning the assembled list[Message] produced by the model. Calling it forces execution if the response prompt has not yet been executed.
New response.reply(prompt=None, **kwargs) method that continues the conversation from any Response, regardless of origin. When the previous response made tool calls and tool_results= was not passed, reply() automatically executes the pending tool calls and threads the results into the next turn. On async responses reply() is awaitable.
New response.to_dict() and Response.from_dict(data, *, model=None) for JSON-safe serialization of a full conversation turn: model id, input chain, assembled output (including reasoning parts and provider metadata), options, and audit fields. Reasoning signatures and thoughtSignature values round-trip via provider_metadata, so multi-turn extended thinking works across process boundaries.
New llm/serialization.py module exposing MessageDict, PartDict, ResponseDict, PromptDict, UsageDict, AttachmentDict, and the per-Part TypedDicts. Every to_dict() / from_dict() method is annotated with the matching TypedDict.
Response.prompt.messages is now the canonical structured input across the entire conversation chain. Conversation.prompt and AsyncConversation.prompt pre-compute the full chain (prior input + prior output + new turn) before constructing the next Prompt, so response.prompt.messages is always exactly what the model was sent.
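
A consumer of typed stream events might separate answer text from visible reasoning along these lines. Event is a hypothetical stand-in for StreamEvent; only the type values and the redacted marker come from the notes above, and the chunk field name is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in for StreamEvent; `chunk` is an assumed field name."""
    type: str
    chunk: str = ""
    redacted: bool = False


def split_stream(events):
    """Separate answer text from visible reasoning, ignoring other event types."""
    text, reasoning = [], []
    for event in events:
        if event.type == "text":
            text.append(event.chunk)
        elif event.type == "reasoning" and not event.redacted:
            reasoning.append(event.chunk)
    return "".join(text), "".join(reasoning)


events = [
    Event("reasoning", "Check the units. "),
    Event("reasoning", redacted=True),  # opaque reasoning carries no visible text
    Event("tool_call_name", "convert"),
    Event("text", "It is "),
    Event("text", "42 km."),
]
answer, thoughts = split_stream(events)
print(answer)
print(thoughts)
```

A command-line client could write the second value to stderr and the first to stdout, matching how llm prompt is described as displaying reasoning.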

CLI

llm prompt and llm chat now display visible reasoning text to stderr in a dim style while the response streams.
New -R/--no-reasoning flag for llm prompt and llm chat to suppress the reasoning stream.
llm logs now renders any visible reasoning emitted during a response under a ## Reasoning heading above the response.
New reasoning column on the responses table populated from the visible-reasoning text.

0.31

24 Apr 23:35

@simonw simonw

0.31

5ce40fd

0.31 Latest

New GPT-5.5 OpenAI model: llm -m gpt-5.5. #1418
New option to set the text verbosity level for GPT-5+ OpenAI models: -o verbosity low. Values are low, medium, high.
New option for setting the image detail level used for image attachments to OpenAI models: -o image_detail low. Values are low, high and auto; GPT-5.4 and GPT-5.5 also accept original.
Models listed in extra-openai-models.yaml are now also registered as asynchronous. #1395

0.30

31 Mar 20:35

@simonw simonw

0.30

7169fe9

0.30

The register_models() plugin hook now takes an optional model_aliases parameter listing all of the models, async models and aliases that have been registered so far by other plugins. A plugin with @hookimpl(trylast=True) can use this to take previously registered models into account. #1389
Added docstrings to public classes and methods and included those directly in the documentation.
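
A plugin that wants to avoid clashing with models registered earlier could use the new parameter roughly as follows. This sketch runs without llm installed: FallbackModel and FALLBACK_MODELS are hypothetical, and the assumption that each model_aliases entry exposes .model, .async_model and .aliases should be checked against the plugin hook documentation.

```python
from types import SimpleNamespace


class FallbackModel:
    """Hypothetical model; a real one would subclass llm.Model."""

    def __init__(self, model_id):
        self.model_id = model_id


FALLBACK_MODELS = [FallbackModel("demo-small"), FallbackModel("demo-large")]


def register_models(register, model_aliases):
    # In a real plugin this function would be decorated with
    # @llm.hookimpl(trylast=True) so that it runs after other plugins.
    taken = set()
    for entry in model_aliases:
        # Assumption: entries expose .model, .async_model and .aliases.
        for model in (entry.model, entry.async_model):
            if model is not None:
                taken.add(model.model_id)
        taken.update(entry.aliases)
    for candidate in FALLBACK_MODELS:
        if candidate.model_id not in taken:
            register(candidate)


# Simulate one model that another plugin has already registered.
already_registered = [
    SimpleNamespace(
        model=SimpleNamespace(model_id="demo-small"),
        async_model=None,
        aliases=["ds"],
    )
]
registered = []
register_models(registered.append, already_registered)
print([model.model_id for model in registered])
```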

0.29

17 Mar 19:24

@simonw simonw

0.29

c7cf7e5

0.29

The -t/--template option now works correctly with the -x/--extract and --xl/--extract-last flags.
llm logs now shows any additional model options in the Markdown output. #1322
New OpenAI models: gpt-5.4, gpt-5.4-mini, gpt-5.4-nano. #1376

0.28

12 Dec 20:03

@simonw simonw

0.28

4766f47

0.28

New OpenAI models: gpt-5.1, gpt-5.1-chat-latest, gpt-5.2 and gpt-5.2-chat-latest. #1300, #1317
LLM now requires Python 3.10 or higher. Python 3.14 is now covered by the tests.
When fetching URLs as fragments using llm -f URL, the request now includes a custom user-agent header: llm/VERSION (https://llm.datasette.io/). #1309
Fixed a bug where fragments were not correctly registered with their source when using llm chat. Thanks, Giuseppe Rota. #1316
Fixed some file descriptor leak warnings. Thanks, Eric Bloch. #1313
Fixed a deprecation warning for asyncio.iscoroutinefunction.
Type annotations for the OpenAI Chat, AsyncChat and Completion execute() methods. Thanks, Arjan Mossel. #1315
The project now uses uv and dependency groups for development. See the updated contributing documentation. #1318
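
The user-agent format mentioned above can be reproduced with the standard library. This is an illustration only: build_fragment_request() is a hypothetical helper, urllib stands in for whichever HTTP client llm actually uses, and no network request is made.

```python
from urllib.request import Request


def build_fragment_request(url: str, version: str) -> Request:
    """Hypothetical helper: prepare a GET request carrying the llm user-agent."""
    user_agent = f"llm/{version} (https://llm.datasette.io/)"
    return Request(url, headers={"User-Agent": user_agent})


request = build_fragment_request("https://example.com/notes.txt", "0.28")
# urllib normalizes header names, so the key is looked up as "User-agent".
print(request.get_header("User-agent"))
```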

0.27.1

12 Aug 05:15

@simonw simonw

0.27.1

921fae9

0.27.1

llm chat -t template now correctly loads any tools that are included in that template. #1239
Fixed a bug where llm -m gpt5 -o reasoning_effort minimal --save gm saved a template containing invalid YAML. #1237
Fixed a bug where running llm chat -t template could cause prompts to be duplicated. #1240
Less confusing error message if a requested toolbox class is unavailable. #1238

0.27

11 Aug 21:31

@simonw simonw

0.27

e15e1ad

0.27

This release adds support for the new GPT-5 family of models from OpenAI. It also enhances tool calling in a number of ways, including allowing templates to bundle pre-configured tools.

New features

New models: gpt-5, gpt-5-mini and gpt-5-nano. #1229
LLM templates can now include a list of tools. These can be named tools from plugins or arbitrary Python function blocks, see Tools in templates. #1009
Tools can now return attachments, for models that support features such as image input. #1014
New methods on the Toolbox class: .add_tool(), .prepare() and .prepare_async(), described in Dynamic toolboxes. #1111
New before_call= and after_call= arguments, as in model.conversation(before_call=x, after_call=y), for registering callback functions to run before and after tool calls. See tool debugging hooks for details. #1088
Some model providers can serve different models from the same configured URL - llm-llama-server for example. Plugins for these providers can now record the resolved ID of the model that was actually used in the LLM logs, using the response.set_resolved_model(model_id) method. #1117
Raising llm.CancelToolCall now only cancels the current tool call, passing an error back to the model and allowing it to continue. #1148
New -l/--latest option for llm logs -q searchterm for searching logs ordered by date (most recent first) instead of the default relevance search. #1177
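
The interaction between the call hooks and per-call cancellation can be sketched with stand-ins. CancelCall substitutes for llm.CancelToolCall and run_tool_calls() is a hypothetical executor; the hook signatures, the wording of the error result, and skipping after_call for a cancelled call are all assumptions made for the example.

```python
class CancelCall(Exception):
    """Hypothetical stand-in for llm.CancelToolCall."""


def run_tool_calls(calls, tools, before_call=None, after_call=None):
    """Run (name, arguments) pairs; a cancelled call yields an error result only for itself."""
    results = []
    for name, arguments in calls:
        try:
            if before_call is not None:
                before_call(name, arguments)
            output = tools[name](**arguments)
        except CancelCall as ex:
            # Only this call is cancelled; the loop continues with the next one.
            results.append(f"Cancelled: {ex}")
            continue
        if after_call is not None:
            after_call(name, arguments, output)
        results.append(output)
    return results


def approve(name, arguments):
    if name == "delete_file":
        raise CancelCall("delete_file needs approval")


tools = {
    "add": lambda a, b: a + b,
    "delete_file": lambda path: f"deleted {path}",
}
audit_log = []
results = run_tool_calls(
    [("delete_file", {"path": "notes.txt"}), ("add", {"a": 1, "b": 2})],
    tools,
    before_call=approve,
    after_call=lambda name, arguments, output: audit_log.append(name),
)
print(results)
```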

Bug fixes and documentation

The register_embedding_models hook is now documented. #1049
Show visible stack trace for llm templates show invalid-template-name. #1053
Handle invalid tool names more gracefully in llm chat. #1104
Add a Tool plugins section to the plugin directory. #1110
Error on register(Klass) if the passed class is not a subclass of Toolbox. #1114
Add -h for --help for all llm CLI commands. #1134
Add missing dataclasses to advanced model plugins docs. #1137
Fixed a bug where llm logs -T llm_version "version" --async incorrectly recorded just one single log entry when it should have recorded two. #1150
All extra OpenAI model keys in extra-openai-models.yaml are now documented. #1228
