startvibecoding/llmspeedtest

LLMSpeed - LLM Token Speed Testing Tool

Test the speed of OpenAI Chat and Anthropic format models, measuring first token latency and tokens per second.

Features

  • Support OpenAI Chat and Anthropic streaming APIs
  • Measure first token latency (ms) and tokens per second
  • Detect invalid models early via content-type check
  • Support reasoning_content field for reasoning models
  • Export results to CSV
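As a rough illustration of the two metrics listed above, the sketch below times a stream of text chunks. It is not the tool's actual code: `measure_stream` and `delta_text` are hypothetical names, each non-empty chunk is counted as one token (an approximation, since the real counting method is not described here), and the delta is assumed to be a plain dict carrying `content` or `reasoning_content`.

```python
import time
from typing import Callable, Iterable, Tuple


def delta_text(delta: dict) -> str:
    """Return visible text from a streamed delta (assumed dict shape).

    Falls back to `reasoning_content`, which some reasoning models
    emit before any regular `content` arrives.
    """
    return delta.get("content") or delta.get("reasoning_content") or ""


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Return (first_token_latency_ms, tokens_per_second).

    Returns (-1.0, -1.0) when the stream yields no text, mirroring the
    failure value used in the results file.
    """
    start = clock()
    first = None
    count = 0
    for text in chunks:
        if not text:
            continue  # skip empty keep-alive or role-only chunks
        now = clock()
        if first is None:
            first = now  # time of the first non-empty chunk
        count += 1  # approximation: one chunk counted as one token
    end = clock()
    if first is None:
        return -1.0, -1.0
    latency_ms = (first - start) * 1000
    elapsed = end - first
    tokens_per_second = count / elapsed if elapsed > 0 else -1.0
    return latency_ms, tokens_per_second
```

In practice `chunks` would be a generator that pulls text out of a streaming response, for example by applying `delta_text` to each event. The throughput here is measured from the first token onward, so it excludes the initial wait; that is a design choice of this sketch, not a documented property of the tool.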

Config File Format (config.yaml)

test_prompt: "Please introduce yourself"
models:
  - base_url: "https://api.openai.com/v1"
    type: "openai-chat"
    models:
      - "gpt-4o"
      - "gpt-4o-mini"
    api_key: "sk-xxx"
  - base_url: "https://api.anthropic.com"
    type: "anthropic"
    models:
      - "claude-3-5-sonnet-20241022"
    api_key: "sk-ant-xxx"
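To make the nesting in this config concrete, here is a minimal sketch of how such a file could be loaded and flattened into one test target per model. The function names `load_config` and `iter_targets` are hypothetical and are not taken from llmspeed.py; the sketch assumes each provider entry carries the four keys shown above.

```python
from typing import Any, Dict, Iterator, Tuple


def load_config(path: str) -> Dict[str, Any]:
    """Parse a YAML config file into a dict."""
    import yaml  # PyYAML, installed by the pip command in the Run section

    with open(path, encoding="utf-8") as f:
        return yaml.safe_load(f)


def iter_targets(config: Dict[str, Any]) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (base_url, type, model, api_key) for every configured model."""
    for provider in config.get("models", []):
        for model in provider.get("models", []):
            yield provider["base_url"], provider["type"], model, provider["api_key"]


# The same structure as the example config, written as a Python dict.
EXAMPLE_CONFIG = {
    "test_prompt": "Please introduce yourself",
    "models": [
        {
            "base_url": "https://api.openai.com/v1",
            "type": "openai-chat",
            "models": ["gpt-4o", "gpt-4o-mini"],
            "api_key": "sk-xxx",
        },
        {
            "base_url": "https://api.anthropic.com",
            "type": "anthropic",
            "models": ["claude-3-5-sonnet-20241022"],
            "api_key": "sk-ant-xxx",
        },
    ],
}

if __name__ == "__main__":
    for base_url, api_type, model, _key in iter_targets(EXAMPLE_CONFIG):
        print(f"{api_type:12} {model:30} {base_url}")
```

Note that the top-level `models` key lists providers, while the inner `models` key lists model names for that provider, so the example config expands to three targets.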

Run

pip install openai anthropic pyyaml
python llmspeed.py

Use a custom config file:

python llmspeed.py my_config.yaml

Output

Results are saved to results.csv with the following columns:

Column                Description
base_url              API endpoint
type                  API type (openai-chat or anthropic)
model                 Model name
first_token_latency   First token latency in ms (-1 if failed)
tokens_per_second     Tokens per second (-1 if failed)
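Since failed runs are recorded as -1 rather than omitted, a consumer of the file has to filter them out before comparing models. The sketch below shows one way to do that with the standard `csv` module. It assumes results.csv starts with a header row using the column names above; `split_results` is a hypothetical helper, and the sample rows are invented values for illustration only, not real measurements.

```python
import csv
import io

# Invented sample data for illustration; not real benchmark results.
SAMPLE = """base_url,type,model,first_token_latency,tokens_per_second
https://api.openai.com/v1,openai-chat,gpt-4o,412.5,58.3
https://api.anthropic.com,anthropic,claude-3-5-sonnet-20241022,-1,-1
"""


def split_results(fileobj):
    """Split result rows into (succeeded, failed) lists.

    A row counts as failed when either metric is negative, matching
    the -1 failure marker described in the column table.
    """
    succeeded, failed = [], []
    for row in csv.DictReader(fileobj):
        latency = float(row["first_token_latency"])
        tps = float(row["tokens_per_second"])
        record = {**row, "first_token_latency": latency, "tokens_per_second": tps}
        if latency < 0 or tps < 0:
            failed.append(record)
        else:
            succeeded.append(record)
    return succeeded, failed


if __name__ == "__main__":
    ok, bad = split_results(io.StringIO(SAMPLE))
    for r in sorted(ok, key=lambda r: r["tokens_per_second"], reverse=True):
        print(r["model"], r["first_token_latency"], r["tokens_per_second"])
    print("failed:", [r["model"] for r in bad])
```

To run it against a real file, replace the `io.StringIO(SAMPLE)` argument with `open("results.csv", newline="", encoding="utf-8")`.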
