TOON Format for Python

⚠️ Beta Status (v0.9.x): This library is in active development and is working towards spec compliance. A beta is published to PyPI. The API may change before the 1.0.0 release.

Compact, human-readable serialization format for LLM contexts with 30-60% token reduction vs JSON. Combines YAML-like indentation with CSV-like tabular arrays. Working towards full compatibility with the official TOON specification.

Key Features: Minimal syntax • Tabular arrays for uniform data • Array length validation • Python 3.8+ • Comprehensive test coverage.

# A beta is published to PyPI; to install from source:
git clone https://github.com/toon-format/toon-python.git
cd toon-python
uv sync
# Or install directly from GitHub:
pip install git+https://github.com/toon-format/toon-python.git

Quick Start

from toon_format import encode, decode
# Simple object
encode({"name": "Alice", "age": 30})
# name: Alice
# age: 30
# Tabular array (uniform objects)
encode([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
# [2,]{id,name}:
# 1,Alice
# 2,Bob
# Decode back to Python
decode("items[2]: apple,banana")
# {'items': ['apple', 'banana']}

CLI Usage

# Auto-detect format by extension
toon input.json -o output.toon # Encode
toon data.toon -o output.json # Decode
echo '{"x": 1}' | toon - # Stdin/stdout
# Options
toon data.json --encode --delimiter "\t" --length-marker
toon data.toon --decode --no-strict --indent 4

Options: -e/--encode • -d/--decode • -o/--output • --delimiter • --indent • --length-marker • --no-strict
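The extension-based auto-detection described above can be pictured with a small sketch. This is an illustration only, not the CLI's actual code: `detect_mode` is a hypothetical helper, and the handling of stdin (`-`) is deliberately left out because the README does not say how the direction is chosen in that case.

```python
from pathlib import Path


def detect_mode(path: str) -> str:
    """Guess the conversion direction from the input file extension.

    Hypothetical helper: .json input means encode to TOON,
    .toon input means decode to JSON.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".json":
        return "encode"
    if suffix == ".toon":
        return "decode"
    # No recognizable extension: the caller must be explicit.
    raise ValueError(f"cannot infer mode from {path!r}; pass --encode or --decode")


print(detect_mode("input.json"))  # encode
print(detect_mode("data.toon"))   # decode
```

When the extension is missing or unfamiliar, passing -e/--encode or -d/--decode explicitly avoids relying on any guess.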

API Reference

`encode(value, options=None)` → `str`

encode({"id": 123}, {"delimiter": "\t", "indent": 4, "lengthMarker": "#"})

Options:

delimiter: "," (default), "\t", "|"
indent: Spaces per level (default: 2)
lengthMarker: "" (default) or "#" to prefix array lengths

`decode(input_str, options=None)` → `Any`

decode("id: 123", {"indent": 2, "strict": True})

Options:

indent: Expected indent size (default: 2)
strict: Validate syntax, lengths, delimiters (default: True)
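To make the length check behind `strict` concrete, here is a minimal sketch that parses a single inline array line such as `items[2]: apple,banana` and compares the declared length with the number of values found. It is not the library's parser: `parse_inline_array` and the regular expression are assumptions for illustration, values are kept as strings, and quoting, nesting, and alternative delimiters are ignored.

```python
import re

# Matches lines of the form  key[N]: v1,v2,...  (comma delimiter only).
HEADER = re.compile(r"^(?P<key>\w+)\[(?P<length>\d+)\]:\s*(?P<values>.*)$")


def parse_inline_array(line: str, strict: bool = True) -> dict:
    """Parse one inline array line; in strict mode, enforce the declared length."""
    match = HEADER.match(line)
    if match is None:
        raise ValueError(f"not an inline array: {line!r}")
    declared = int(match["length"])
    raw = match["values"]
    values = raw.split(",") if raw else []
    if strict and len(values) != declared:
        raise ValueError(f"declared {declared} items, found {len(values)}")
    return {match["key"]: values}


print(parse_inline_array("items[2]: apple,banana"))
# {'items': ['apple', 'banana']}
```

With `strict=False` the sketch accepts a mismatched count and simply returns the values it found, which mirrors the intent of `--no-strict` without claiming to reproduce its exact behavior.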

Token Counting & Comparison

Measure token efficiency and compare formats:

from toon_format import estimate_savings, compare_formats, count_tokens
# Measure savings
data = {"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}
result = estimate_savings(data)
print(f"Saves {result['savings_percent']:.1f}% tokens") # e.g. Saves 37.8% tokens
# Visual comparison
print(compare_formats(data))
# Format Comparison
# ────────────────────────────────────────────────
# Format Tokens Size (chars)
# JSON 45 123
# TOON 28 85
# ────────────────────────────────────────────────
# Savings: 17 tokens (37.8%)
# Count tokens directly
toon_str = encode(data)
tokens = count_tokens(toon_str) # Uses tiktoken (gpt5/gpt5-mini)

Requires tiktoken: uv add tiktoken (benchmark features are optional)
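The savings figure in the comparison output is plain arithmetic on the two token counts. The sketch below reproduces it from the numbers shown above (45 JSON tokens, 28 TOON tokens) without needing tiktoken; `savings` is a hypothetical helper, not part of the library's API.

```python
def savings(json_tokens: int, toon_tokens: int) -> tuple:
    """Return (tokens saved, percent saved relative to the JSON count)."""
    saved = json_tokens - toon_tokens
    percent = 100.0 * saved / json_tokens
    return saved, percent


saved, percent = savings(45, 28)
print(f"Savings: {saved} tokens ({percent:.1f}%)")
# Savings: 17 tokens (37.8%)
```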

Format Specification

| Type | Example Input | TOON Output |
| --- | --- | --- |
| Object | `{"name": "Alice", "age": 30}` | `name: Alice` `age: 30` |
| Primitive Array | `[1, 2, 3]` | `[3]: 1,2,3` |
| Tabular Array | `[{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]` | `[2,]{id,name}:` `1,A` `2,B` |
| Mixed Array | `[{"x": 1}, 42, "hi"]` | `[3]:` `- x: 1` `- 42` `- hi` |
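The Tabular Array row can be illustrated with a small pure-Python sketch that turns a list of uniform dicts into the header-plus-rows form shown in the table. This is a simplified illustration, not the library's encoder: `encode_tabular` is a hypothetical name, the header mirrors the `[N,]{fields}:` form shown in this README, and quoting, row indentation, and type normalization are omitted.

```python
def encode_tabular(rows: list) -> str:
    """Render a list of dicts with identical keys as a header line plus one row per dict."""
    if not rows or not all(isinstance(r, dict) for r in rows):
        raise ValueError("expected a non-empty list of dicts")
    fields = list(rows[0])
    # Uniform means every row has the same keys in the same order.
    if any(list(r) != fields for r in rows):
        raise ValueError("rows are not uniform")
    header = f"[{len(rows)},]{{{','.join(fields)}}}:"
    lines = [",".join(str(r[f]) for f in fields) for r in rows]
    return "\n".join([header, *lines])


print(encode_tabular([{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]))
# [2,]{id,name}:
# 1,A
# 2,B
```

Because the field names appear once in the header instead of once per object, the saving grows with the number of rows, which is where the token reduction for uniform data comes from.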

Quoting: Only when necessary (empty, keywords, numeric strings, whitespace, structural chars, delimiters)
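One way to read the quoting rule is as a predicate over string values. The sketch below is an interpretation under stated assumptions, not the library's logic: `needs_quotes` is hypothetical, "keywords" is taken to mean `true`, `false`, and `null`, "whitespace" is taken to mean leading or trailing whitespace, and the set of structural characters is a guess.

```python
import re

KEYWORDS = {"true", "false", "null"}          # assumed keyword set
NUMERIC = re.compile(r"-?\d+(\.\d+)?([eE][+-]?\d+)?")
STRUCTURAL = set(':[]{}"\\')                  # assumed structural characters


def needs_quotes(text: str, delimiter: str = ",") -> bool:
    """Return True if the string would be ambiguous when written unquoted."""
    if text == "":
        return True                           # empty string
    if text in KEYWORDS:
        return True                           # would read back as a literal
    if NUMERIC.fullmatch(text):
        return True                           # would read back as a number
    if text != text.strip():
        return True                           # leading or trailing whitespace
    if any(ch in STRUCTURAL for ch in text):
        return True                           # structural characters
    return delimiter in text                  # active delimiter


print(needs_quotes("Alice"))  # False
print(needs_quotes("42"))     # True
print(needs_quotes("a,b"))    # True
```

Note that the delimiter check depends on the active delimiter: under these assumptions, `a|b` needs no quotes with the default comma but does with `delimiter="|"`.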

Type Normalization: Infinity/NaN/Functions → null • Decimal → float • datetime → ISO 8601 • -0 → 0
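The normalization rules listed above can be sketched for scalar values as follows. This is an illustrative helper, not the library's implementation: `normalize` is a hypothetical name, it does not recurse into containers, and returning a float `0.0` for negative zero is an assumption.

```python
import math
from datetime import datetime
from decimal import Decimal


def normalize(value):
    """Map a Python scalar to a value that has a direct TOON/JSON representation."""
    if callable(value):
        return None                     # functions -> null
    if isinstance(value, Decimal):
        value = float(value)            # Decimal -> float
    if isinstance(value, float):
        if math.isnan(value) or math.isinf(value):
            return None                 # NaN / Infinity -> null
        if value == 0:
            return 0.0                  # -0.0 -> 0.0
        return value
    if isinstance(value, datetime):
        return value.isoformat()        # datetime -> ISO 8601 string
    return value


print(normalize(float("inf")))                    # None
print(normalize(Decimal("1.5")))                  # 1.5
print(normalize(datetime(2024, 1, 2, 3, 4, 5)))   # 2024-01-02T03:04:05
```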

Development

# Setup (requires uv: https://docs.astral.sh/uv/)
git clone https://github.com/toon-format/toon-python.git
cd toon-python
uv sync
# Run tests (792 tests, 91% coverage, 85% enforced)
uv run pytest --cov=toon_format --cov-report=term
# Code quality
uv run ruff check src/ tests/ # Lint
uv run ruff format src/ tests/ # Format
uv run mypy src/ # Type check

CI/CD: GitHub Actions • Python 3.8-3.14 • Coverage enforcement • PR coverage comments

Project Status & Roadmap

Following semantic versioning towards 1.0.0:

v0.8.x - Initial code set, tests, documentation ✅
v0.9.x - Serializer improvements, spec compliance testing, publishing setup (current)
v1.0.0-rc.x - Release candidates for production readiness
v1.0.0 - First stable release with full spec compliance

See CONTRIBUTING.md for detailed guidelines.

Documentation

📘 Full Documentation - Complete guides and references
🔧 API Reference - Detailed function documentation
📋 Format Specification - TOON syntax and rules
🤖 LLM Integration - Best practices for LLM usage
📜 TOON Spec - Official specification
🐛 Issues - Bug reports and features
🤝 Contributing - Contribution guidelines

License

MIT License – see LICENSE for details
