Replicate's v2 Python SDK is now in public beta. 🎉
As always, the replicate package is published on PyPI, and you can install it with pip using the --pre flag:
pip install --pre replicate
What’s new?
This new version is a complete rewrite of the SDK, built in partnership with Stainless, the team that helps design and maintain official SDKs for companies like OpenAI, Anthropic, and Cloudflare.
Replicate's v2 Python SDK is generated dynamically from our public OpenAPI schema. This allows us to automate client code generation and provide a Python API with method names, type hints, and documentation that is perfectly consistent with our HTTP API.
Now that most of the client code is generated dynamically, all changes to Replicate’s HTTP API are automatically supported by the Python SDK. This means whenever we add a new operation (like the new search API) or improve our docs for an existing API (like predictions.create()), the changes are automatically published in a new release of the Python SDK.
Running models
We think running AI models should be as easy as installing and running a package from PyPI.
With this idea in mind, we designed a new replicate.use() method that lets you run models as Python functions:
```python
# pip install --pre replicate
import replicate

claude = replicate.use("anthropic/claude-4.5-sonnet")
seedream = replicate.use("bytedance/seedream-4")
veo = replicate.use("google/veo-3-fast")

# Enhance a simple prompt
image_prompt = claude(
    prompt="bananas wearing cowboy hats",
    system_prompt="turn prompts into image prompts",
)

# Generate an image from the enhanced prompt
images = seedream(prompt=image_prompt)

# Generate a video from the image
video = veo(prompt="dancing bananas", image_input=images[0])

open(video)
```
The new .use() method also supports streaming output. Here’s an example showing how to consume output tokens from Claude Sonnet 4.5 while the model is running:
```python
import replicate

claude = replicate.use("anthropic/claude-4.5-sonnet", streaming=True)

for chunk in claude(prompt="Write a haiku about streaming output."):
    print(str(chunk), end="")

# Bytes flow through the pipe
# Data chunks arrive in waves
# Code drinks from the stream
```
API design
Our new SDK was designed to be approachable for newcomers while also being feature-complete for power users. There are three levels of APIs built into the new SDK, varying from simple high-level abstractions to powerful low-level methods that give you complete control:
🍰 High-level API
The v2 SDK provides a new replicate.use() method that makes it easy to run models and get their output all at once or as a streaming response. The replicate.run() method is still supported so your applications will continue to work, but we recommend using use() going forward.
🛠️ Mid-level API
The v2 SDK has methods for every single operation available in our public HTTP API, like search(), predictions.create(), and collections.list(). These fine-grained methods are defined by our OpenAPI schema and updated in lock-step with our API. Every new feature, bug fix, or documentation improvement in our API becomes available immediately in a new release of the Python SDK. See our HTTP API docs and Python SDK docs for reference.
The SDK now supports all of these API operations:
- `search` - Search models, collections, and docs (beta)
- `predictions.create` - Create a prediction
- `predictions.get` - Get a prediction
- `predictions.list` - List predictions
- `predictions.cancel` - Cancel a prediction
- `models.create` - Create a model
- `models.get` - Get a model
- `models.list` - List public models
- `models.update` - Update metadata for a model
- `models.search` - Search public models
- `models.delete` - Delete a model
- `models.examples.list` - List examples for a model
- `models.predictions.create` - Create a prediction using an official model
- `models.readme.get` - Get a model's README
- `models.versions.get` - Get a model version
- `models.versions.list` - List model versions
- `models.versions.delete` - Delete a model version
- `collections.get` - Get a collection of models
- `collections.list` - List collections of models
- `deployments.create` - Create a deployment
- `deployments.get` - Get a deployment
- `deployments.list` - List deployments
- `deployments.update` - Update a deployment
- `deployments.delete` - Delete a deployment
- `deployments.predictions.create` - Create a prediction using a deployment
- `trainings.create` - Create a training
- `trainings.get` - Get a training
- `trainings.list` - List trainings
- `trainings.cancel` - Cancel a training
- `hardware.list` - List available hardware for models
- `account.get` - Get the authenticated account
- `webhooks.default.secret.get` - Get the signing secret for the default webhook
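Methods like predictions.create() and predictions.get() expose the prediction lifecycle directly: a prediction moves through states (roughly starting → processing → succeeded/failed/canceled), and you can poll it until it reaches a terminal state. Here is a minimal, self-contained sketch of that polling pattern, using a stand-in fetch_prediction() function (hypothetical, for illustration) in place of real API calls:

```python
import time

# Stand-in for replicate.predictions.get(); returns a dict shaped like the
# API's prediction object. In real code this would call the Replicate API.
FAKE_STATES = iter(["starting", "processing", "succeeded"])

def fetch_prediction(prediction_id):
    return {"id": prediction_id, "status": next(FAKE_STATES), "output": "done!"}

def wait_for_prediction(prediction_id, poll_interval=0.01):
    """Poll until the prediction reaches a terminal state."""
    terminal = {"succeeded", "failed", "canceled"}
    while True:
        prediction = fetch_prediction(prediction_id)
        if prediction["status"] in terminal:
            return prediction
        time.sleep(poll_interval)

result = wait_for_prediction("pred_123")
print(result["status"])  # succeeded
```

In practice the mid-level methods return typed prediction objects rather than plain dicts, but the create-then-poll shape is the same.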
🔬 Low-level API
The v2 SDK includes generic request methods like replicate.get() and replicate.post() for making custom API requests with full control over the request and response. This is useful for testing undocumented APIs, setting custom headers, or getting lower-level access to response objects.
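At the HTTP level, a call like replicate.get("/account") corresponds to an authenticated GET request against the API. A stdlib sketch of the equivalent raw request, built but not sent (assuming the Bearer authorization scheme and a placeholder token), might look like this:

```python
import urllib.request

API_BASE = "https://api.replicate.com/v1"
token = "<your-api-token>"  # placeholder; read from REPLICATE_API_TOKEN in real code

# Build (but don't send) the raw request that replicate.get("/account")
# roughly corresponds to at the HTTP level.
req = urllib.request.Request(
    f"{API_BASE}/account",
    headers={"Authorization": f"Bearer {token}"},
    method="GET",
)

print(req.full_url)      # https://api.replicate.com/v1/account
print(req.get_method())  # GET
```

The SDK's generic request methods handle all of this for you (auth, serialization, retries) while still letting you hit arbitrary paths.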
New SDK features
In addition to the new API design, there are loads of new features in the v2 SDK:
- Type hints: Typed requests and responses provide autocomplete and documentation within your editor.
- Pagination: All list methods are paginated, and the SDK provides auto-paginating iterators with each list response so you do not have to request successive pages manually.
- Retries: Certain errors like 408, 409, 429, and >=500 are automatically retried 2 times by default, with a short exponential backoff.
- Async/await support: Full async client with `AsyncReplicate` that supports all SDK methods including `run()` and `stream()`.
- Alternative HTTP backends: Optional aiohttp support for improved concurrency performance in async applications.
- Streaming output: Stream model outputs in real-time with `replicate.stream()` for language models.
- File upload flexibility: Pass files as URLs, file handles, bytes, PathLike objects, or tuples of `(filename, contents, media_type)`.
- Raw response access: Access response headers and raw data with `.with_raw_response` and `.with_streaming_response`.
- Per-request configuration: Override client options on a per-request basis with `.with_options()`.
- Configurable timeouts: Fine-grained timeout control at the client or request level, including separate read/write/connect timeouts.
- Better error handling: Specific exception types for different HTTP status codes (`BadRequestError`, `AuthenticationError`, `RateLimitError`, etc.) with access to underlying response data.
- Manual pagination control: Granular page control with `has_next_page()`, `next_page_info()`, and `get_next_page()` methods.
- Response serialization: Pydantic models with built-in `.to_json()` and `.to_dict()` methods.
- Logging support: Built-in logging via the `REPLICATE_LOG` environment variable.
- Context manager support: Proper resource management with context managers for both sync and async clients.
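The retry policy described above (408, 409, 429, and 5xx responses retried twice by default with short exponential backoff) can be sketched in a few lines. This is an illustration of the policy as stated, not the SDK's actual implementation, and the delay constants are made up:

```python
RETRYABLE_STATUSES = {408, 409, 429}

def is_retryable(status_code):
    """Mirror the documented policy: 408, 409, 429, and any 5xx."""
    return status_code in RETRYABLE_STATUSES or status_code >= 500

def backoff_delays(max_retries=2, base=0.5):
    """Short exponential backoff: base, 2*base, ... (illustrative constants)."""
    return [base * (2 ** attempt) for attempt in range(max_retries)]

print(is_retryable(429))  # True
print(is_retryable(404))  # False
print(backoff_delays())   # [0.5, 1.0]
```

In the real client, the retry count and timeouts are configurable at the client or request level, so you don't need to write loops like this yourself.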
For more details about all these new features, see the README.
Installation
To get started using the public beta pre-release, pip install it with the --pre flag:
pip install --pre replicate
Usage
For basic usage, the new 2.x client works just like the old 1.x client.
As always, you can import the library and run models with just a few lines of code:
```python
import replicate

model = "anthropic/claude-4.5-sonnet"
prompt = "Write me a poem about SDKs"

output = replicate.run(model, input={"prompt": prompt})
print(output)
```
You can also stream output as the model is running:
```python
import replicate

claude = replicate.use("anthropic/claude-4.5-sonnet", streaming=True)

for event in claude(prompt="Write a haiku about streaming output."):
    print(str(event), end="")
```
Want to try it out but don’t have Python at your disposal? Check out the Google Colab notebook.
See sdks.replicate.com/python for a complete summary of available methods.
Migrating from v1 to v2
v2 is a complete rewrite of the Python SDK, so there are user-facing breaking changes you should be aware of if you’re migrating an existing codebase from v1 to v2. We’ve published a guide to simplify the migration process:
https://github.com/replicate/replicate-python-beta/blob/main/UPGRADING.md
☝️ We recommend feeding this guide to an agent like Claude Code, OpenAI Codex, Gemini CLI, or Cursor to help you automate the process of migrating your project from v1 to v2.
Pinning to v1
v2 has lots of new bells and whistles, but you are not required to upgrade. If you’re already using the v1 version and want to continue using it, pin the version number in your dependency file.
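For example, with pip you could keep a project on the 1.x line with a version specifier like the following (the exact syntax depends on your packaging tool; a requirements.txt line would use the same specifier):

```shell
# Stay on the v1 line by excluding 2.x releases
pip install 'replicate<2'
```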
See pinning in the migration guide for details.