OpenAI compatibility
Gemini models are accessible through the OpenAI libraries (Python and TypeScript/JavaScript) and the REST API: update three lines of code and use your Gemini API key. If you aren't already using the OpenAI libraries, we recommend calling the Gemini API directly.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Explain to me how AI works"
        }
    ]
)

print(response.choices[0].message)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const response = await openai.chat.completions.create({
    model: "gemini-2.5-flash",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Explain to me how AI works" },
    ],
});

console.log(response.choices[0].message);
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer $GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"messages": [
{
"role": "user",
"content": "Explain to me how AI works"
}
]
}'
What changed? Just three lines!
api_key="GEMINI_API_KEY": Replace "GEMINI_API_KEY" with your actual Gemini API key, which you can get in Google AI Studio.base_url="https://generativelanguage.googleapis.com/v1beta/openai/": This tells the OpenAI library to send requests to the Gemini API endpoint instead of the default URL.model="gemini-2.5-flash": Choose a compatible Gemini model
Thinking
Gemini models are trained to think through complex problems, leading to significantly improved reasoning. The Gemini API provides thinking parameters that give fine-grained control over how much the model thinks.
Different Gemini models have different reasoning configurations; the following table shows how they map to OpenAI's reasoning_effort values:
| `reasoning_effort` (OpenAI) | `thinking_level` (Gemini 3 Pro) | `thinking_level` (Gemini 3 Flash) | `thinking_budget` (Gemini 2.5) |
|---|---|---|---|
| `minimal` | `low` | `minimal` | 1,024 |
| `low` | `low` | `low` | 1,024 |
| `medium` | ERROR, not supported | `medium` | 8,192 |
| `high` | `high` | `high` | 24,576 |
If no reasoning_effort is specified, Gemini uses the model's default thinking level or budget.
If you want to disable thinking, you can set reasoning_effort to "none" (2.5 models only; thinking cannot be turned off for Gemini 2.5 Pro or Gemini 3 models).
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    reasoning_effort="low",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Explain to me how AI works"
        }
    ]
)

print(response.choices[0].message)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const response = await openai.chat.completions.create({
    model: "gemini-2.5-flash",
    reasoning_effort: "low",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Explain to me how AI works" },
    ],
});

console.log(response.choices[0].message);
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer $GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"reasoning_effort": "low",
"messages": [
{
"role": "user",
"content": "Explain to me how AI works"
}
]
}'
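To disable thinking instead, set reasoning_effort to "none", as noted above. A minimal sketch in Python (2.5 Flash; thinking can't be turned off for Gemini 2.5 Pro or Gemini 3 models):

from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

# reasoning_effort="none" turns thinking off entirely for Gemini 2.5 Flash.
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    reasoning_effort="none",
    messages=[{"role": "user", "content": "Explain to me how AI works"}]
)

print(response.choices[0].message)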
Gemini thinking models also produce thought summaries. You can use the extra_body field to include Gemini-specific fields in your request.
Note that reasoning_effort and thinking_level/thinking_budget have overlapping functionality, so they can't be used at the same time.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Explain to me how AI works"}],
    extra_body={
        'extra_body': {
            "google": {
                "thinking_config": {
                    "thinking_budget": 1024,  # a token budget (integer), not a level
                    "include_thoughts": True
                }
            }
        }
    }
)

print(response.choices[0].message)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const response = await openai.chat.completions.create({
    model: "gemini-2.5-flash",
    messages: [{ role: "user", content: "Explain to me how AI works" }],
    extra_body: {
        "google": {
            "thinking_config": {
                "thinking_budget": 1024,  // a token budget (integer), not a level
                "include_thoughts": true
            }
        }
    }
});

console.log(response.choices[0].message);
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"messages": [{"role": "user", "content": "Explain to me how AI works"}],
"extra_body": {
"google": {
"thinking_config": {
"include_thoughts": true
}
}
}
}'
Gemini 3 supports thought signatures through the OpenAI-compatible chat completions API. You can find a full example on the thought signatures page.
Streaming
The Gemini API supports streaming responses.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Hello!"
        }
    ],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

async function main() {
    const completion = await openai.chat.completions.create({
        model: "gemini-2.5-flash",
        messages: [
            { role: "system", content: "You are a helpful assistant." },
            { role: "user", content: "Hello!" }
        ],
        stream: true,
    });

    for await (const chunk of completion) {
        console.log(chunk.choices[0].delta.content);
    }
}

main();
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"messages": [
{"role": "user", "content": "Explain to me how AI works"}
],
"stream": true
}'
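Each streamed chunk carries a delta; to reassemble the full reply, concatenate the content of each delta. A minimal sketch continuing the Python example above:

# Accumulate the text deltas from the stream into the complete reply.
full_text = ""
for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content:  # some chunks (e.g. the role-only first chunk) carry no text
        full_text += delta.content

print(full_text)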
Function calling
Function calling makes it easier for you to get structured data outputs from generative models and is supported in the Gemini API.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. Chicago, IL",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        }
    }
]

messages = [{"role": "user", "content": "What's the weather like in Chicago today?"}]
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

print(response)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

async function main() {
    const messages = [{ role: "user", content: "What's the weather like in Chicago today?" }];
    const tools = [
        {
            type: "function",
            function: {
                name: "get_weather",
                description: "Get the weather in a given location",
                parameters: {
                    type: "object",
                    properties: {
                        location: {
                            type: "string",
                            description: "The city and state, e.g. Chicago, IL",
                        },
                        unit: { type: "string", enum: ["celsius", "fahrenheit"] },
                    },
                    required: ["location"],
                },
            }
        }
    ];

    const response = await openai.chat.completions.create({
        model: "gemini-2.5-flash",
        messages: messages,
        tools: tools,
        tool_choice: "auto",
    });

    console.log(response);
}

main();
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"messages": [
{
"role": "user",
"content": "What'\''s the weather like in Chicago today?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. Chicago, IL"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}
}
],
"tool_choice": "auto"
}'
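The model doesn't run the function itself: the response's tool_calls name the function and its arguments, you execute it locally, append the result as a tool message, and request a final answer. A minimal sketch continuing the Python example, with get_weather as a hypothetical stand-in implementation:

import json

def get_weather(location, unit="celsius"):
    # Hypothetical local implementation; replace with a real weather lookup.
    return json.dumps({"location": location, "temperature": 15, "unit": unit})

tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = get_weather(**args)

# Send the model's tool call and your result back for a final, grounded answer.
messages.append(response.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

follow_up = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=messages,
    tools=tools
)

print(follow_up.choices[0].message.content)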
Image understanding
Gemini models are natively multimodal and provide best-in-class performance on many common vision tasks.
Python
import base64
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
base64_image = encode_image("Path/to/agi/image.jpeg")

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is in this image?",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}"
                    },
                },
            ],
        }
    ],
)

print(response.choices[0])
JavaScript
import OpenAI from "openai";
import fs from "fs/promises";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

async function encodeImage(imagePath) {
    try {
        const imageBuffer = await fs.readFile(imagePath);
        return imageBuffer.toString('base64');
    } catch (error) {
        console.error("Error encoding image:", error);
        return null;
    }
}

async function main() {
    const imagePath = "Path/to/agi/image.jpeg";
    const base64Image = await encodeImage(imagePath);

    const messages = [
        {
            role: "user",
            content: [
                { type: "text", text: "What is in this image?" },
                {
                    type: "image_url",
                    image_url: { url: `data:image/jpeg;base64,${base64Image}` },
                },
            ],
        }
    ];

    try {
        const response = await openai.chat.completions.create({
            model: "gemini-2.5-flash",
            messages: messages,
        });
        console.log(response.choices[0]);
    } catch (error) {
        console.error("Error calling Gemini API:", error);
    }
}

main();
REST
bash -c '
base64_image=$(base64 -i "Path/to/agi/image.jpeg");
curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GEMINI_API_KEY" \
-d "{
\"model\": \"gemini-2.5-flash\",
\"messages\": [
{
\"role\": \"user\",
\"content\": [
{ \"type\": \"text\", \"text\": \"What is in this image?\" },
{
\"type\": \"image_url\",
\"image_url\": { \"url\": \"data:image/jpeg;base64,${base64_image}\" }
}
]
}
]
}"
'
Generate an image
Generate an image:
Python
import base64
from openai import OpenAI
from PIL import Image
from io import BytesIO

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

response = client.images.generate(
    model="imagen-3.0-generate-002",
    prompt="a portrait of a sheepadoodle wearing a cape",
    response_format='b64_json',
    n=1,
)

for image_data in response.data:
    image = Image.open(BytesIO(base64.b64decode(image_data.b64_json)))
    image.show()
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});

async function main() {
    const image = await openai.images.generate({
        model: "imagen-3.0-generate-002",
        prompt: "a portrait of a sheepadoodle wearing a cape",
        response_format: "b64_json",
        n: 1,
    });

    console.log(image.data);
}

main();
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/images/generations"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"model": "imagen-3.0-generate-002",
"prompt": "a portrait of a sheepadoodle wearing a cape",
"response_format": "b64_json",
"n": 1,
}'
Audio understanding
Analyze audio input:
Python
import base64
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

with open("/path/to/your/audio/file.wav", "rb") as audio_file:
    base64_audio = base64.b64encode(audio_file.read()).decode('utf-8')

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Transcribe this audio",
                },
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": base64_audio,
                        "format": "wav"
                    }
                }
            ],
        }
    ],
)

print(response.choices[0].message.content)
JavaScript
import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});

const audioFile = fs.readFileSync("/path/to/your/audio/file.wav");
const base64Audio = Buffer.from(audioFile).toString("base64");

async function main() {
    const response = await client.chat.completions.create({
        model: "gemini-2.5-flash",
        messages: [
            {
                role: "user",
                content: [
                    { type: "text", text: "Transcribe this audio" },
                    {
                        type: "input_audio",
                        input_audio: {
                            data: base64Audio,
                            format: "wav",
                        },
                    },
                ],
            },
        ],
    });
    console.log(response.choices[0].message.content);
}

main();
REST
bash -c '
base64_audio=$(base64 -i "/path/to/your/audio/file.wav");
curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GEMINI_API_KEY" \
-d "{
\"model\": \"gemini-2.5-flash\",
\"messages\": [
{
\"role\": \"user\",
\"content\": [
{ \"type\": \"text\", \"text\": \"Transcribe this audio file.\" },
{
\"type\": \"input_audio\",
\"input_audio\": {
\"data\": \"${base64_audio}\",
\"format\": \"wav\"
}
}
]
}
]
}"
'
Structured output
Gemini models can output JSON objects in any structure you define.
Python
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gemini-2.5-flash",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "John and Susan are going to an AI conference on Friday."},
    ],
    response_format=CalendarEvent,
)

print(completion.choices[0].message.parsed)
JavaScript
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const CalendarEvent = z.object({
    name: z.string(),
    date: z.string(),
    participants: z.array(z.string()),
});

const completion = await openai.chat.completions.parse({
    model: "gemini-2.5-flash",
    messages: [
        { role: "system", content: "Extract the event information." },
        { role: "user", content: "John and Susan are going to an AI conference on Friday" },
    ],
    response_format: zodResponseFormat(CalendarEvent, "event"),
});

const event = completion.choices[0].message.parsed;
console.log(event);
Embeddings
Text embeddings measure the relatedness of text strings and can be generated using the Gemini API.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.embeddings.create(
    input="Your text string goes here",
    model="gemini-embedding-001"
)

print(response.data[0].embedding)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

async function main() {
    const embedding = await openai.embeddings.create({
        model: "gemini-embedding-001",
        input: "Your text string goes here",
    });
    console.log(embedding);
}

main();
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/embeddings"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"input": "Your text string goes here",
"model": "gemini-embedding-001"
}'
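Because embeddings measure relatedness, a common next step is comparing two vectors with cosine similarity. A minimal sketch continuing the Python example above (assumes numpy is installed and that the endpoint accepts a list of inputs, as the OpenAI API does):

import numpy as np

resp = client.embeddings.create(
    input=["How do I bake a cake?", "What is a good cake recipe?"],
    model="gemini-embedding-001"
)

a = np.array(resp.data[0].embedding)
b = np.array(resp.data[1].embedding)

# Cosine similarity: values close to 1.0 mean the strings are closely related.
similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"Cosine similarity: {similarity:.3f}")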
Batch API
You can create batch jobs, submit them, and check their status using the OpenAI library.
You'll need to prepare the JSONL file in OpenAI input format. For example:
{"custom_id":"request-1","method":"POST","url":"/v1/chat/completions","body":{"model":"gemini-2.5-flash","messages":[{"role":"user","content":"Tell me a one-sentence joke."}]}}
{"custom_id":"request-2","method":"POST","url":"/v1/chat/completions","body":{"model":"gemini-2.5-flash","messages":[{"role":"user","content":"Why is the sky blue?"}]}}
OpenAI compatibility for Batch supports creating a batch, monitoring job status, and viewing batch results.
File upload and download are currently not supported, so the following example uses the genai client for uploading and downloading files, the same as when using the Gemini Batch API.
Python
import time

from openai import OpenAI
# Regular genai client for uploads & downloads
from google import genai
from google.genai import types

client = genai.Client()
openai_client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

# Upload the JSONL file in OpenAI input format, using regular genai SDK
uploaded_file = client.files.upload(
    file='my-batch-requests.jsonl',
    config=types.UploadFileConfig(display_name='my-batch-requests', mime_type='jsonl')
)

# Create batch
batch = openai_client.batches.create(
    input_file_id=uploaded_file.name,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

# Wait for batch to finish (up to 24h)
while True:
    batch = openai_client.batches.retrieve(batch.id)
    if batch.status in ('completed', 'failed', 'cancelled', 'expired'):
        break
    print(f"Batch not finished. Current state: {batch.status}. Waiting 30 seconds...")
    time.sleep(30)
print(f"Batch finished: {batch}")

# Download results in OpenAI output format, using regular genai SDK
file_content = client.files.download(file=batch.output_file_id).decode('utf-8')
# See batch_output JSONL in OpenAI output format
for line in file_content.splitlines():
    print(line)
The OpenAI SDK also supports generating embeddings with the Batch API. To do so, switch the create method's endpoint field to the embeddings endpoint, and update the url and model keys in the JSONL file; the embeddings body takes an input string rather than messages:
# JSONL file using embeddings model and endpoint
# {"custom_id": "request-1", "method": "POST", "url": "/v1/embeddings", "body": {"model": "gemini-embedding-001", "input": "Tell me a one-sentence joke."}}
# {"custom_id": "request-2", "method": "POST", "url": "/v1/embeddings", "body": {"model": "gemini-embedding-001", "input": "Why is the sky blue?"}}
# ...

# Create batch step with embeddings endpoint
batch = openai_client.batches.create(
    input_file_id=uploaded_file.name,
    endpoint="/v1/embeddings",
    completion_window="24h"
)
See the Batch embedding generation section of the OpenAI compatibility cookbook for a complete example.
extra_body
There are several features supported by Gemini that are not available in OpenAI
models but can be enabled using the extra_body field.
| `extra_body` feature | Description |
|---|---|
| `cached_content` | Corresponds to Gemini's `GenerateContentRequest.cached_content`. |
| `thinking_config` | Corresponds to Gemini's `ThinkingConfig`. |
cached_content
Here's an example of using extra_body to set cached_content:
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

stream = client.chat.completions.create(
    model="gemini-2.5-pro",
    n=1,
    messages=[
        {
            "role": "user",
            "content": "Summarize the video"
        }
    ],
    stream=True,
    stream_options={'include_usage': True},
    extra_body={
        'extra_body': {
            'google': {
                'cached_content': "cachedContents/0000aaaa1111bbbb2222cccc3333dddd4444eeee"
            }
        }
    }
)

for chunk in stream:
    print(chunk)
    if chunk.usage:  # usage arrives on the final chunk when include_usage is set
        print(chunk.usage.to_dict())
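Cached content names like the one above come from the Gemini caching API. A minimal sketch of creating a cache with the google-genai SDK; the file path and TTL are illustrative, and cached content has a minimum token size, so the uploaded file must be large:

from google import genai
from google.genai import types

genai_client = genai.Client()

# Upload a large file once so later requests can reference it from the cache.
document = genai_client.files.upload(file="path/to/large-document.txt")

cache = genai_client.caches.create(
    model="gemini-2.5-pro",
    config=types.CreateCachedContentConfig(
        display_name="my-cache",
        contents=[document],
        ttl="3600s",  # keep the cache for one hour
    ),
)

print(cache.name)  # e.g. cachedContents/0000aaaa1111bbbb...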
List models
Get a list of available Gemini models:
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

models = client.models.list()
for model in models:
    print(model.id)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});

async function main() {
    const list = await openai.models.list();
    for await (const model of list) {
        console.log(model);
    }
}

main();
REST
curl https://generativelanguage.googleapis.com/v1beta/openai/models \
  -H "Authorization: Bearer $GEMINI_API_KEY"
Retrieve a model
Retrieve a Gemini model:
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

model = client.models.retrieve("gemini-2.5-flash")
print(model.id)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});

async function main() {
    const model = await openai.models.retrieve("gemini-2.5-flash");
    console.log(model.id);
}

main();
REST
curl https://generativelanguage.googleapis.com/v1beta/openai/models/gemini-2.5-flash \
  -H "Authorization: Bearer $GEMINI_API_KEY"
Current limitations
Support for the OpenAI libraries is still in beta while we extend feature support.
If you have questions about supported parameters, upcoming features, or run into any issues getting started with Gemini, join our Developer Forum.
What's next
Try our OpenAI Compatibility Colab to work through more detailed examples.