total_duration: How long the response took to generateload_duration: How long the model took to loadprompt_eval_count: How many input tokens were processedprompt_eval_duration: How long it took to evaluate the prompteval_count: How many output tokens were processeseval_duration: How long it took to generate the output tokens
Example response
For endpoints that return usage metrics, the response body will include the usage fields. For example, a non-streaming call to/api/generate may return the following response:
{
"model": "gemma4",
"created_at": "2025年10月17日T23:14:07.414671Z",
"response": "Hello! How can I help you today?",
"done": true,
"done_reason": "stop",
"total_duration": 174560334,
"load_duration": 101397084,
"prompt_eval_count": 11,
"prompt_eval_duration": 13074791,
"eval_count": 18,
"eval_duration": 52479709
}
done is true.