Fix #2102: Add structured response token counting examples #2133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Open

jalateras wants to merge 1 commit into openai:main

from jalateras:add-structured-response-token-counting

Open

Fix #2102: Add structured response token counting examples #2133

jalateras wants to merge 1 commit into openai:main from jalateras:add-structured-response-token-counting

+54 −1

Conversation

@jalateras

Copy link

@jalateras jalateras commented Sep 12, 2025

Summary

This PR adds comprehensive examples for counting tokens when using structured responses with the response_format parameter, addressing the gap identified in issue #2102.

Changes

Added a new section (Section 8) to the tiktoken notebook that covers:

Token counting for JSON mode (response_format={"type": "json_object"})
Token counting for structured outputs with JSON schemas
Comparison of token usage across different response formats
Helper function num_tokens_for_structured_response() to calculate schema overhead

Implementation Details

The solution introduces:

A helper function that calculates base message tokens plus schema overhead
Practical examples showing real-world use cases (sentiment analysis, book information extraction)
Verification against actual OpenAI API responses to validate accuracy
Clear documentation of overhead estimates for different response formats

Testing

All code examples have been tested to ensure:

Functions correctly calculate token estimates
Examples execute without errors
Token estimates align reasonably with actual API usage

Impact

This enhancement helps developers:

Better estimate API costs when using structured outputs
Understand the token overhead of different response formats
Plan for context window management with JSON schemas
Make informed decisions about schema complexity vs token usage

Fixes #2102

@jalateras


 feat: add structured response token counting examples to tiktoken not...

ae4fe95

...ebook
Added comprehensive examples for counting tokens when using structured outputs:
- JSON mode token counting with minimal overhead estimation
- Structured output with JSON schema token counting
- Comparison of token usage across different response formats
- Helper function to calculate schema overhead for accurate cost estimation
This addresses the need for understanding token usage with response_format parameter,
helping developers better estimate costs and manage context windows when using
structured outputs.
Fixes openai#2102

Labels

None yet

1 participant

@jalateras

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Fix #2102: Add structured response token counting examples #2133

Are you sure you want to change the base?

Fix #2102: Add structured response token counting examples #2133

Conversation

@jalateras jalateras commented Sep 12, 2025

Summary

Changes

Implementation Details

Testing

Impact

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant