Generate a chat message

POST /api/chat

curl http://localhost:11434/api/chat -d '{
  "model": "gemma4",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]
}'

{
  "model": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "message": {
    "role": "assistant",
    "content": "<string>",
    "thinking": "<string>",
    "tool_calls": [
      {
        "function": {
          "name": "<string>",
          "description": "<string>",
          "arguments": {}
        }
      }
    ],
    "images": [
      "<string>"
    ]
  },
  "done": true,
  "done_reason": "<string>",
  "total_duration": 123,
  "load_duration": 123,
  "prompt_eval_count": 123,
  "prompt_eval_duration": 123,
  "eval_count": 123,
  "eval_duration": 123,
  "logprobs": [
    {
      "token": "<string>",
      "logprob": 123,
      "bytes": [
        123
      ],
      "top_logprobs": [
        {
          "token": "<string>",
          "logprob": 123,
          "bytes": [
            123
          ]
        }
      ]
    }
  ]
}

Body

application/json

model

string

required

Model name

messages

object[]

required

Chat history as an array of message objects (each with a role and content)
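
Because the endpoint receives the whole chat history on every call, a client typically appends the new user turn to the earlier messages before sending. The sketch below only builds the request body; `build_chat_request` and the sample assistant reply are hypothetical, introduced here for illustration.

```python
import json


def build_chat_request(model, history, user_text):
    """Return a /api/chat request body with user_text appended to history.

    The history list passed in is copied, not modified.
    """
    messages = list(history) + [{"role": "user", "content": user_text}]
    return {"model": model, "messages": messages}


# Earlier turns of the conversation (the assistant text is made up).
history = [
    {"role": "user", "content": "why is the sky blue?"},
    {"role": "assistant", "content": "Because of Rayleigh scattering."},
]

body = build_chat_request("gemma4", history, "and why are sunsets red?")
print(json.dumps(body, indent=2))
```

The resulting JSON can be passed to `curl -d` or sent with any HTTP client.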

tools

object[]

Optional list of function tools the model may call during the chat
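
A minimal sketch of one way a client might handle tool calls. The `tool_calls[].function.name` and `arguments` fields follow the response example on this page; the shape of the `tools` entry is an assumption (its child attributes are not shown in this section), and `get_temperature`, `REGISTRY`, and `run_tool_calls` are hypothetical names.

```python
def get_temperature(city: str) -> str:
    """Hypothetical local function the model may ask the client to call."""
    return f"22 C in {city}"


# Assumed tool definition shape; check the tools child attributes before relying on it.
tools = [{
    "type": "function",
    "function": {
        "name": "get_temperature",
        "description": "Get the current temperature for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names to the local callables that implement them.
REGISTRY = {"get_temperature": get_temperature}


def run_tool_calls(message):
    """Run every known tool call in a response message and return the results."""
    results = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        handler = REGISTRY.get(fn["name"])
        if handler is None:
            continue  # unknown tool name: skip rather than raise
        results.append(handler(**(fn.get("arguments") or {})))
    return results


# Hand-written message in the shape of the response example, not real model output.
sample_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {"function": {"name": "get_temperature", "arguments": {"city": "Paris"}}}
    ],
}
print(run_tool_calls(sample_message))
```

Keeping the name-to-function mapping in an explicit registry means the client only ever runs functions it has chosen to expose.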

format

Format to return the response in. Can be the string json or a JSON schema object

Available options:

json
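
As a rough illustration of the schema form of `format`, the sketch below builds a request body that carries a JSON schema and decodes the reply. It assumes the structured output arrives as a JSON string in `message.content`; the schema, `parse_structured`, and the sample response are hypothetical.

```python
import json

# Hypothetical schema describing the object we want back.
schema = {
    "type": "object",
    "properties": {
        "color": {"type": "string"},
        "reason": {"type": "string"},
    },
    "required": ["color", "reason"],
}

body = {
    "model": "gemma4",
    "messages": [{"role": "user", "content": "why is the sky blue? Answer in JSON."}],
    "format": schema,  # alternatively the string "json"
}


def parse_structured(response):
    """Decode message.content as JSON; raises ValueError if it is not valid JSON."""
    return json.loads(response["message"]["content"])


# Hand-written response in the shape of the example on this page.
sample_response = {
    "message": {
        "role": "assistant",
        "content": '{"color": "blue", "reason": "Rayleigh scattering"}',
    },
    "done": True,
}
print(parse_structured(sample_response)["color"])
```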

options

object

Runtime options that control text generation

stream

boolean

default:true
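
Since `stream` defaults to true, a client usually has to reassemble the reply from several chunks. The sketch below assumes the stream is newline-delimited JSON, that each chunk carries a fragment of `message.content`, and that the last chunk has `done` set to true; that wire format is not described in this section. `collect_stream` and the sample lines are hypothetical.

```python
import json


def collect_stream(lines):
    """Join message.content fragments from streamed chunks.

    Returns (full_text, final_chunk); final_chunk is None if no chunk
    with done == true was seen.
    """
    parts = []
    final = None
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # ignore blank keep-alive lines
        chunk = json.loads(raw)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            final = chunk
            break
    return "".join(parts), final


# Hand-written chunks standing in for the lines of an HTTP response body.
sample_lines = [
    b'{"message": {"role": "assistant", "content": "The sky"}, "done": false}\n',
    b'{"message": {"role": "assistant", "content": " is blue."}, "done": false}\n',
    b'{"message": {"role": "assistant", "content": ""}, "done": true, "done_reason": "stop"}\n',
]
text, final = collect_stream(sample_lines)
print(text)
```

Setting `"stream": false` in the request body avoids this step when incremental output is not needed.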

think

When true, returns separate thinking output in addition to content. Can be a boolean (true/false) or a string ("high", "medium", "low") for supported models.

keep_alive

Model keep-alive duration (for example 5m or 0 to unload immediately)

logprobs

boolean

Whether to return log probabilities of the output tokens

top_logprobs

integer

Number of most likely tokens to return at each token position when logprobs are enabled

Response

Chat response

model

string

Model name used to generate this message

created_at

string<date-time>

Timestamp of response creation (ISO 8601)

message

object

done

boolean

Indicates whether the chat response has finished

done_reason

string

Reason the response finished

total_duration

integer

Total time spent generating in nanoseconds

load_duration

integer

Time spent loading the model in nanoseconds

prompt_eval_count

integer

Number of tokens in the prompt

prompt_eval_duration

integer

Time spent evaluating the prompt in nanoseconds

eval_count

integer

Number of tokens generated in the response

eval_duration

integer

Time spent generating tokens in nanoseconds
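
The count and duration fields can be combined into throughput figures. Because the durations are in nanoseconds, tokens per second is the count multiplied by 10^9 and divided by the duration. The helper names and sample numbers below are made up for illustration.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def tokens_per_second(count, duration_ns):
    """Return throughput in tokens/s, or None if the duration is missing or zero."""
    if not duration_ns:
        return None
    return count * NANOSECONDS_PER_SECOND / duration_ns


def summarize_timing(response):
    """Derive human-readable timing figures from a finished chat response."""
    return {
        "total_seconds": response["total_duration"] / NANOSECONDS_PER_SECOND,
        "prompt_tokens_per_second": tokens_per_second(
            response["prompt_eval_count"], response["prompt_eval_duration"]
        ),
        "generation_tokens_per_second": tokens_per_second(
            response["eval_count"], response["eval_duration"]
        ),
    }


# Made-up values in the shape of the response fields described above.
sample_response = {
    "total_duration": 6_000_000_000,
    "load_duration": 500_000_000,
    "prompt_eval_count": 20,
    "prompt_eval_duration": 100_000_000,
    "eval_count": 250,
    "eval_duration": 5_000_000_000,
}
summary = summarize_timing(sample_response)
print(summary)
```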

logprobs

object[]

Log probability information for the generated tokens when logprobs are enabled
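
A small sketch of post-processing the `logprobs` array, using the `token`, `logprob`, and `bytes` fields from the response example. It assumes the values are natural-log probabilities, which this section does not state; the helper names and sample entries are hypothetical.

```python
import math


def token_probability(entry):
    """Convert one logprob entry to a probability in [0, 1] (assumes natural log)."""
    return math.exp(entry["logprob"])


def token_text(entry):
    """Rebuild the token text from its byte values, falling back to the token field."""
    raw = entry.get("bytes")
    if raw:
        return bytes(raw).decode("utf-8", errors="replace")
    return entry["token"]


def sequence_logprob(entries):
    """Sum per-token log probabilities to get the log probability of the whole output."""
    return sum(e["logprob"] for e in entries)


# Made-up entries in the shape of the logprobs array.
sample_logprobs = [
    {"token": "Hi", "logprob": math.log(0.5), "bytes": [72, 105], "top_logprobs": []},
    {"token": "!", "logprob": math.log(0.25), "bytes": [33], "top_logprobs": []},
]
for entry in sample_logprobs:
    print(token_text(entry), round(token_probability(entry), 3))
```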
