Pricing

You only pay for what you use on Replicate. Some models are billed by hardware and time, others by input and output.

Public models

Thousands of open-source machine learning models have been contributed by our community, and more are added every day. We also host a wide variety of proprietary models.

Most models are billed by the time they take to run. The price-per-second varies according to the hardware in use. When running or training one of these public models, you only pay for the time it takes to process your request.
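
For time-billed models, the cost of a run is simply its processing time multiplied by the per-second price of the hardware it runs on. The sketch below shows that arithmetic in Python; the 12-second run time is an assumed figure for illustration, and the per-second rate is the gpu-t4 price taken from the hardware pricing table later on this page, used purely as an example figure.

```python
# Estimate the cost of a single time-billed run.
# Both numbers are illustrative: the run time is assumed, and the rate is the
# gpu-t4 price (0ドル.000225/sec) from the hardware pricing table on this page.

run_time_seconds = 12.0        # assumed processing time for one prediction
price_per_second = 0.000225    # gpu-t4 per-second rate

cost = run_time_seconds * price_per_second
print(f"Estimated cost per run: ${cost:.4f}")  # 12 * 0.000225 = 0ドル.0027
```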

Some models are billed by input and output. We've included some examples below, with a worked cost calculation after the list.

Each model's page includes an estimate of how much it will cost you to run.

anthropic/claude-3.7-sonnet
The most intelligent Claude model and the first hybrid reasoning model on the market (claude-3-7-sonnet-20250219)
0ドル.015 / thousand output tokens
3ドル.00 / million input tokens

black-forest-labs/flux-1.1-pro
Faster, better FLUX Pro. Text-to-image model with excellent image quality, prompt adherence, and output diversity.
0ドル.04 / output image

black-forest-labs/flux-dev
A 12 billion parameter rectified flow transformer capable of generating images from text descriptions
0ドル.025 / output image

black-forest-labs/flux-schnell
The fastest image generation model, tailored for local development and personal use
3ドル.00 / thousand output images

deepseek-ai/deepseek-r1
A reasoning model trained with reinforcement learning, on par with OpenAI o1
0ドル.01 / thousand output tokens
3ドル.75 / million input tokens

ideogram-ai/ideogram-v3-quality
The highest-quality Ideogram v3 model. v3 creates images with stunning realism, creative designs, and consistent styles
0ドル.09 / output image

recraft-ai/recraft-v3
Recraft V3 (code-named red_panda) is a text-to-image model that can generate long text and images in a wide range of styles. As of today, it is SOTA in image generation, as shown by the Text-to-Image Benchmark by Artificial Analysis
0ドル.04 / output image

wavespeedai/wan-2.1-i2v-480p
Accelerated inference for Wan 2.1 14B image-to-video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
0ドル.09 / second of output video

wavespeedai/wan-2.1-i2v-720p
Accelerated inference for Wan 2.1 14B image-to-video at high resolution, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
0ドル.25 / second of output video
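
As an illustration of billing by input and output, the sketch below estimates costs from the example prices above: a hypothetical chat request to anthropic/claude-3.7-sonnet and a small batch of images from black-forest-labs/flux-dev. The token and image counts are assumptions chosen for the example, not typical usage figures.

```python
# Illustrative cost arithmetic for models billed by input and output.
# Prices come from the examples above; the usage figures are assumptions.

# anthropic/claude-3.7-sonnet: 3ドル.00 / million input tokens, 0ドル.015 / thousand output tokens
input_tokens = 2_000     # assumed prompt length
output_tokens = 500      # assumed response length
claude_cost = (input_tokens / 1_000_000) * 3.00 + (output_tokens / 1_000) * 0.015

# black-forest-labs/flux-dev: 0ドル.025 / output image
num_images = 4           # assumed number of generated images
flux_cost = num_images * 0.025

print(f"claude-3.7-sonnet request: ${claude_cost:.4f}")  # 0ドル.0060 + 0ドル.0075 = 0ドル.0135
print(f"flux-dev images:           ${flux_cost:.3f}")    # 4 * 0ドル.025 = 0ドル.100
```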

Private models

You aren't limited to the public models on Replicate: you can deploy your own custom models using Cog, our open-source tool for packaging machine learning models.
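
As a rough sketch of what packaging a model with Cog involves, here is a minimal predict.py using Cog's Python interface. The model logic is a placeholder for illustration, and a cog.yaml describing the environment (Python version, dependencies) is also required before the model can be pushed to Replicate.

```python
# Minimal Cog predictor sketch (predict.py). The "model" here is a placeholder;
# a real predictor would load weights in setup() and run inference in predict().

from cog import BasePredictor, Input


class Predictor(BasePredictor):
    def setup(self):
        # Runs once when an instance boots: load models/weights into memory here
        # so individual predictions stay fast.
        self.greeting = "Hello"  # placeholder for loaded model state

    def predict(self, name: str = Input(description="Who to greet")) -> str:
        # Runs for every request; this is the processing time you're billed for.
        return f"{self.greeting}, {name}!"
```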

Unlike public models, most private models (with the exception of fast-booting fine-tunes) run on dedicated hardware, so you don't have to share a queue with anyone else. This means you pay for all the time instances of the model are online: the time they spend setting up; the time they spend idle, waiting for requests; and the time they spend active, processing your requests. If you get a ton of traffic, we automatically scale up and down to handle the demand.

For fast-booting fine-tunes, you're only billed for the time the model is active and processing your requests, so you won't pay for idle time as you do with other private models. Fast-booting fine-tunes are labeled as such in the model's version list.
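
To make the difference concrete, the sketch below compares the two billing styles under assumed numbers: a dedicated-hardware private model that is online for an hour but only busy part of the time, versus a fast-booting fine-tune billed only for its active seconds. The durations are assumptions, and for simplicity both are priced at the gpu-l40s rate from the table below.

```python
# Illustrative comparison of private-model billing (assumed numbers throughout).
# Rate used: gpu-l40s at 0ドル.000975/sec, from the hardware pricing table below.

price_per_second = 0.000975

# Dedicated hardware: billed for setup + idle + active time while the instance is online.
setup_seconds = 120       # assumed boot/setup time
idle_seconds = 2_880      # assumed time spent waiting for requests
active_seconds = 600      # assumed time spent processing requests
dedicated_cost = (setup_seconds + idle_seconds + active_seconds) * price_per_second

# Fast-booting fine-tune: billed only for active processing time.
fast_boot_cost = active_seconds * price_per_second

print(f"Dedicated instance (1h online): ${dedicated_cost:.2f}")   # 3600 * 0.000975 = 3ドル.51
print(f"Fast-booting fine-tune:         ${fast_boot_cost:.3f}")   # 600 * 0.000975 = 0ドル.585
```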

Hardware pricing

Hardware | Identifier | Price | GPU | CPU | GPU RAM | RAM
CPU (Small) | cpu-small | 0ドル.000025/sec (0ドル.09/hr) | - | 1x | - | 2GB
CPU | cpu | 0ドル.000100/sec (0ドル.36/hr) | - | 4x | - | 8GB
Nvidia A100 (80GB) GPU | gpu-a100-large | 0ドル.001400/sec (5ドル.04/hr) | 1x | 10x | 80GB | 144GB
2x Nvidia A100 (80GB) GPU | gpu-a100-large-2x | 0ドル.002800/sec (10ドル.08/hr) | 2x | 20x | 160GB | 288GB
Nvidia H100 GPU | gpu-h100 | 0ドル.001525/sec (5ドル.49/hr) | 1x | 13x | 80GB | 72GB
Nvidia L40S GPU | gpu-l40s | 0ドル.000975/sec (3ドル.51/hr) | 1x | 10x | 48GB | 65GB
2x Nvidia L40S GPU | gpu-l40s-2x | 0ドル.001950/sec (7ドル.02/hr) | 2x | 20x | 96GB | 144GB
Nvidia T4 GPU | gpu-t4 | 0ドル.000225/sec (0ドル.81/hr) | 1x | 4x | 16GB | 16GB

Additional hardware

Hardware | Identifier | Price
4x Nvidia A100 (80GB) GPU | gpu-a100-large-4x | 0ドル.005600/sec (20ドル.16/hr)
8x Nvidia A100 (80GB) GPU | gpu-a100-large-8x | 0ドル.011200/sec (40ドル.32/hr)
2x Nvidia H100 GPU | gpu-h100-2x | 0ドル.003050/sec (10ドル.98/hr)
4x Nvidia H100 GPU | gpu-h100-4x | 0ドル.006100/sec (21ドル.96/hr)
8x Nvidia H100 GPU | gpu-h100-8x | 0ドル.012200/sec (43ドル.92/hr)
4x Nvidia L40S GPU | gpu-l40s-4x | 0ドル.003900/sec (14ドル.04/hr)
8x Nvidia L40S GPU | gpu-l40s-8x | 0ドル.007800/sec (28ドル.08/hr)

Additional multi-GPU A100, H100, and L40S capacity is available with committed spend contracts.
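
The hourly figures above are just the per-second rates scaled to 3,600 seconds, which can be handy when estimating longer-running costs. A minimal sketch of that conversion, using a few identifiers and rates copied from the table:

```python
# The hourly prices in the table are the per-second rates multiplied by 3,600.
per_second_rates = {
    "gpu-t4": 0.000225,
    "gpu-l40s": 0.000975,
    "gpu-a100-large": 0.001400,
    "gpu-h100": 0.001525,
}

for identifier, rate in per_second_rates.items():
    print(f"{identifier}: ${rate * 3600:.2f}/hr")
# gpu-t4: 0ドル.81/hr, gpu-l40s: 3ドル.51/hr, gpu-a100-large: 5ドル.04/hr, gpu-h100: 5ドル.49/hr
```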

Learn more

For a deeper dive, check out how billing works on Replicate.

Enterprise & volume discounts

If you need more support or have complex requirements, we can offer:

  • Dedicated account manager
  • Priority support
  • Higher GPU limits
  • Performance SLAs
  • Help with onboarding, custom models, and optimizations

We also offer volume discounts for high levels of spend. Visit our enterprise page to learn more.
