OpenAI compatibility
Gemini models are accessible through the OpenAI libraries (Python and TypeScript/JavaScript) and the REST API: update three lines of code and use your Gemini API key. If you aren't already using the OpenAI libraries, we recommend calling the Gemini API directly.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Explain to me how AI works"
        }
    ]
)

print(response.choices[0].message)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const response = await openai.chat.completions.create({
    model: "gemini-2.5-flash",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Explain to me how AI works" },
    ],
});

console.log(response.choices[0].message);
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer $GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"messages": [
{
"role": "user",
"content": "Explain to me how AI works"
}
]
}'
What changed? Just three lines!
api_key="GEMINI_API_KEY": Replace "GEMINI_API_KEY" with your actual Gemini API key, which you can get in Google AI Studio.base_url="https://generativelanguage.googleapis.com/v1beta/openai/": This tells the OpenAI library to send requests to the Gemini API endpoint instead of the default URL.model="gemini-2.5-flash": Choose a compatible Gemini model
Thinking
Gemini models are trained to think through complex problems, leading to significantly improved reasoning. The Gemini API provides thinking parameters that give fine-grained control over how much the model thinks.
Different Gemini models have different reasoning configurations; the following table shows how they map to OpenAI's reasoning_effort values:
| `reasoning_effort` (OpenAI) | `thinking_level` (Gemini 3 Pro) | `thinking_level` (Gemini 3 Flash) | `thinking_budget` (Gemini 2.5) |
|---|---|---|---|
| `minimal` | `low` | `minimal` | 1,024 |
| `low` | `low` | `low` | 1,024 |
| `medium` | ERROR, not supported | `medium` | 8,192 |
| `high` | `high` | `high` | 24,576 |
If no reasoning_effort is specified, Gemini uses the model's default thinking level or budget.
If you want to disable thinking, you can set reasoning_effort to "none" (2.5 models only; thinking cannot be turned off for Gemini 2.5 Pro or Gemini 3 models).
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    reasoning_effort="low",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Explain to me how AI works"
        }
    ]
)

print(response.choices[0].message)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const response = await openai.chat.completions.create({
    model: "gemini-2.5-flash",
    reasoning_effort: "low",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Explain to me how AI works" },
    ],
});

console.log(response.choices[0].message);
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer $GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"reasoning_effort": "low",
"messages": [
{
"role": "user",
"content": "Explain to me how AI works"
}
]
}'
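To disable thinking instead, set reasoning_effort to "none", as noted above. A minimal sketch in Python (2.5 Flash; thinking can't be turned off for Gemini 2.5 Pro or Gemini 3 models):

from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

# reasoning_effort="none" turns thinking off entirely for Gemini 2.5 Flash.
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    reasoning_effort="none",
    messages=[{"role": "user", "content": "Explain to me how AI works"}]
)

print(response.choices[0].message)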
Gemini thinking models also produce thought summaries. You can use the extra_body field to include Gemini-specific fields in your request.
Note that reasoning_effort and thinking_level/thinking_budget have overlapping functionality, so they can't be used at the same time.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Explain to me how AI works"}],
    extra_body={
        'extra_body': {
            "google": {
                "thinking_config": {
                    "thinking_budget": 1024,  # a token budget (integer), not a level
                    "include_thoughts": True
                }
            }
        }
    }
)

print(response.choices[0].message)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const response = await openai.chat.completions.create({
    model: "gemini-2.5-flash",
    messages: [{ role: "user", content: "Explain to me how AI works" }],
    extra_body: {
        "google": {
            "thinking_config": {
                "thinking_budget": 1024,  // a token budget (integer), not a level
                "include_thoughts": true
            }
        }
    }
});

console.log(response.choices[0].message);
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"messages": [{"role": "user", "content": "Explain to me how AI works"}],
"extra_body": {
"google": {
"thinking_config": {
"include_thoughts": true
}
}
}
}'
Gemini 3 supports thought signatures through the OpenAI-compatible chat completions API. You can find a full example on the thought signatures page.
Streaming
The Gemini API supports streaming responses.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Hello!"
        }
    ],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

async function main() {
    const completion = await openai.chat.completions.create({
        model: "gemini-2.5-flash",
        messages: [
            { role: "system", content: "You are a helpful assistant." },
            { role: "user", content: "Hello!" }
        ],
        stream: true,
    });

    for await (const chunk of completion) {
        console.log(chunk.choices[0].delta.content);
    }
}

main();
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"messages": [
{"role": "user", "content": "Explain to me how AI works"}
],
"stream": true
}'
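Each streamed chunk carries a delta; to reassemble the full reply, concatenate the content of each delta. A minimal sketch continuing the Python example above:

# Accumulate the text deltas from the stream into the complete reply.
full_text = ""
for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content:  # some chunks (e.g. the role-only first chunk) carry no text
        full_text += delta.content

print(full_text)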
Function calling
Function calling makes it easier for you to get structured data outputs from generative models and is supported in the Gemini API.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. Chicago, IL",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        }
    }
]

messages = [{"role": "user", "content": "What's the weather like in Chicago today?"}]
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

print(response)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

async function main() {
    const messages = [{ role: "user", content: "What's the weather like in Chicago today?" }];
    const tools = [
        {
            type: "function",
            function: {
                name: "get_weather",
                description: "Get the weather in a given location",
                parameters: {
                    type: "object",
                    properties: {
                        location: {
                            type: "string",
                            description: "The city and state, e.g. Chicago, IL",
                        },
                        unit: { type: "string", enum: ["celsius", "fahrenheit"] },
                    },
                    required: ["location"],
                },
            }
        }
    ];

    const response = await openai.chat.completions.create({
        model: "gemini-2.5-flash",
        messages: messages,
        tools: tools,
        tool_choice: "auto",
    });

    console.log(response);
}

main();
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"model": "gemini-2.5-flash",
"messages": [
{
"role": "user",
"content": "What'\''s the weather like in Chicago today?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. Chicago, IL"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}
}
],
"tool_choice": "auto"
}'
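The model doesn't run the function itself: the response's tool_calls name the function and its arguments, you execute it locally, append the result as a tool message, and request a final answer. A minimal sketch continuing the Python example, with get_weather as a hypothetical stand-in implementation:

import json

def get_weather(location, unit="celsius"):
    # Hypothetical local implementation; replace with a real weather lookup.
    return json.dumps({"location": location, "temperature": 15, "unit": unit})

tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = get_weather(**args)

# Send the model's tool call and your result back for a final, grounded answer.
messages.append(response.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

follow_up = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=messages,
    tools=tools
)

print(follow_up.choices[0].message.content)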
Image understanding
Gemini models are natively multimodal and provide best-in-class performance on many common vision tasks.
Python
import base64
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
base64_image = encode_image("Path/to/agi/image.jpeg")

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is in this image?",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}"
                    },
                },
            ],
        }
    ],
)

print(response.choices[0])
JavaScript
import OpenAI from "openai";
import fs from "fs/promises";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

async function encodeImage(imagePath) {
    try {
        const imageBuffer = await fs.readFile(imagePath);
        return imageBuffer.toString('base64');
    } catch (error) {
        console.error("Error encoding image:", error);
        return null;
    }
}

async function main() {
    const imagePath = "Path/to/agi/image.jpeg";
    const base64Image = await encodeImage(imagePath);

    const messages = [
        {
            role: "user",
            content: [
                { type: "text", text: "What is in this image?" },
                {
                    type: "image_url",
                    image_url: { url: `data:image/jpeg;base64,${base64Image}` },
                },
            ],
        }
    ];

    try {
        const response = await openai.chat.completions.create({
            model: "gemini-2.5-flash",
            messages: messages,
        });
        console.log(response.choices[0]);
    } catch (error) {
        console.error("Error calling Gemini API:", error);
    }
}

main();
REST
bash -c '
base64_image=$(base64 -i "Path/to/agi/image.jpeg");
curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GEMINI_API_KEY" \
-d "{
\"model\": \"gemini-2.5-flash\",
\"messages\": [
{
\"role\": \"user\",
\"content\": [
{ \"type\": \"text\", \"text\": \"What is in this image?\" },
{
\"type\": \"image_url\",
\"image_url\": { \"url\": \"data:image/jpeg;base64,${base64_image}\" }
}
]
}
]
}"
'
Generate an image
Generate an image:
Python
import base64
from openai import OpenAI
from PIL import Image
from io import BytesIO

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

response = client.images.generate(
    model="imagen-3.0-generate-002",
    prompt="a portrait of a sheepadoodle wearing a cape",
    response_format='b64_json',
    n=1,
)

for image_data in response.data:
    image = Image.open(BytesIO(base64.b64decode(image_data.b64_json)))
    image.show()
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});

async function main() {
    const image = await openai.images.generate({
        model: "imagen-3.0-generate-002",
        prompt: "a portrait of a sheepadoodle wearing a cape",
        response_format: "b64_json",
        n: 1,
    });

    console.log(image.data);
}

main();
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/images/generations"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"model": "imagen-3.0-generate-002",
"prompt": "a portrait of a sheepadoodle wearing a cape",
"response_format": "b64_json",
"n": 1,
}'
Audio understanding
Analyze audio input:
Python
import base64
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

with open("/path/to/your/audio/file.wav", "rb") as audio_file:
    base64_audio = base64.b64encode(audio_file.read()).decode('utf-8')

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Transcribe this audio",
                },
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": base64_audio,
                        "format": "wav"
                    }
                }
            ],
        }
    ],
)

print(response.choices[0].message.content)
JavaScript
import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});

const audioFile = fs.readFileSync("/path/to/your/audio/file.wav");
const base64Audio = Buffer.from(audioFile).toString("base64");

async function main() {
    const response = await client.chat.completions.create({
        model: "gemini-2.5-flash",
        messages: [
            {
                role: "user",
                content: [
                    { type: "text", text: "Transcribe this audio" },
                    {
                        type: "input_audio",
                        input_audio: {
                            data: base64Audio,
                            format: "wav",
                        },
                    },
                ],
            },
        ],
    });
    console.log(response.choices[0].message.content);
}

main();
REST
bash -c '
base64_audio=$(base64 -i "/path/to/your/audio/file.wav");
curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GEMINI_API_KEY" \
-d "{
\"model\": \"gemini-2.5-flash\",
\"messages\": [
{
\"role\": \"user\",
\"content\": [
{ \"type\": \"text\", \"text\": \"Transcribe this audio file.\" },
{
\"type\": \"input_audio\",
\"input_audio\": {
\"data\": \"${base64_audio}\",
\"format\": \"wav\"
}
}
]
}
]
}"
'
Structured output
Gemini models can output JSON objects in any structure you define.
Python
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gemini-2.5-flash",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "John and Susan are going to an AI conference on Friday."},
    ],
    response_format=CalendarEvent,
)

print(completion.choices[0].message.parsed)
JavaScript
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const CalendarEvent = z.object({
    name: z.string(),
    date: z.string(),
    participants: z.array(z.string()),
});

const completion = await openai.chat.completions.parse({
    model: "gemini-2.5-flash",
    messages: [
        { role: "system", content: "Extract the event information." },
        { role: "user", content: "John and Susan are going to an AI conference on Friday" },
    ],
    response_format: zodResponseFormat(CalendarEvent, "event"),
});

const event = completion.choices[0].message.parsed;
console.log(event);
Embeddings
Text embeddings measure the relatedness of text strings and can be generated using the Gemini API.
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.embeddings.create(
    input="Your text string goes here",
    model="gemini-embedding-001"
)

print(response.data[0].embedding)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

async function main() {
    const embedding = await openai.embeddings.create({
        model: "gemini-embedding-001",
        input: "Your text string goes here",
    });
    console.log(embedding);
}

main();
REST
curl"https://generativelanguage.googleapis.com/v1beta/openai/embeddings"\
-H"Content-Type: application/json"\
-H"Authorization: Bearer GEMINI_API_KEY"\
-d'{
"input": "Your text string goes here",
"model": "gemini-embedding-001"
}'
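Because embeddings measure relatedness, a common next step is comparing two vectors with cosine similarity. A minimal sketch continuing the Python example above (assumes numpy is installed and that the endpoint accepts a list of inputs, as the OpenAI API does):

import numpy as np

resp = client.embeddings.create(
    input=["How do I bake a cake?", "What is a good cake recipe?"],
    model="gemini-embedding-001"
)

a = np.array(resp.data[0].embedding)
b = np.array(resp.data[1].embedding)

# Cosine similarity: values close to 1.0 mean the strings are closely related.
similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"Cosine similarity: {similarity:.3f}")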
Batch API
You can create batch jobs, submit them, and check their status using the OpenAI library.
You'll need to prepare the JSONL file in OpenAI input format. For example:
{"custom_id":"request-1","method":"POST","url":"/v1/chat/completions","body":{"model":"gemini-2.5-flash","messages":[{"role":"user","content":"Tell me a one-sentence joke."}]}}
{"custom_id":"request-2","method":"POST","url":"/v1/chat/completions","body":{"model":"gemini-2.5-flash","messages":[{"role":"user","content":"Why is the sky blue?"}]}}
OpenAI compatibility for Batch supports creating a batch, monitoring job status, and viewing batch results.
File upload and download are currently not supported, so the following example uses the genai client for uploading and downloading files, the same as when using the Gemini Batch API.
Python
import time

from openai import OpenAI
# Regular genai client for uploads & downloads
from google import genai
from google.genai import types

client = genai.Client()
openai_client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

# Upload the JSONL file in OpenAI input format, using regular genai SDK
uploaded_file = client.files.upload(
    file='my-batch-requests.jsonl',
    config=types.UploadFileConfig(display_name='my-batch-requests', mime_type='jsonl')
)

# Create batch
batch = openai_client.batches.create(
    input_file_id=uploaded_file.name,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

# Wait for batch to finish (up to 24h)
while True:
    batch = openai_client.batches.retrieve(batch.id)
    if batch.status in ('completed', 'failed', 'cancelled', 'expired'):
        break
    print(f"Batch not finished. Current state: {batch.status}. Waiting 30 seconds...")
    time.sleep(30)
print(f"Batch finished: {batch}")

# Download results in OpenAI output format, using regular genai SDK
file_content = client.files.download(file=batch.output_file_id).decode('utf-8')
# See batch_output JSONL in OpenAI output format
for line in file_content.splitlines():
    print(line)
The OpenAI SDK also supports generating embeddings with the Batch API. To do so, switch the create method's endpoint field to the embeddings endpoint, and update the url and model keys in the JSONL file; the embeddings body takes an input string rather than messages:
# JSONL file using embeddings model and endpoint
# {"custom_id": "request-1", "method": "POST", "url": "/v1/embeddings", "body": {"model": "gemini-embedding-001", "input": "Tell me a one-sentence joke."}}
# {"custom_id": "request-2", "method": "POST", "url": "/v1/embeddings", "body": {"model": "gemini-embedding-001", "input": "Why is the sky blue?"}}
# ...

# Create batch step with embeddings endpoint
batch = openai_client.batches.create(
    input_file_id=uploaded_file.name,
    endpoint="/v1/embeddings",
    completion_window="24h"
)
See the Batch embedding generation section of the OpenAI compatibility cookbook for a complete example.
extra_body
There are several features supported by Gemini that are not available in OpenAI
models but can be enabled using the extra_body field.
| `extra_body` feature | Description |
|---|---|
| `cached_content` | Corresponds to Gemini's `GenerateContentRequest.cached_content`. |
| `thinking_config` | Corresponds to Gemini's `ThinkingConfig`. |
cached_content
Here's an example of using extra_body to set cached_content:
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

stream = client.chat.completions.create(
    model="gemini-2.5-pro",
    n=1,
    messages=[
        {
            "role": "user",
            "content": "Summarize the video"
        }
    ],
    stream=True,
    stream_options={'include_usage': True},
    extra_body={
        'extra_body': {
            'google': {
                'cached_content': "cachedContents/0000aaaa1111bbbb2222cccc3333dddd4444eeee"
            }
        }
    }
)

for chunk in stream:
    print(chunk)
    if chunk.usage:  # usage arrives on the final chunk when include_usage is set
        print(chunk.usage.to_dict())
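Cached content names like the one above come from the Gemini caching API. A minimal sketch of creating a cache with the google-genai SDK; the file path and TTL are illustrative, and cached content has a minimum token size, so the uploaded file must be large:

from google import genai
from google.genai import types

genai_client = genai.Client()

# Upload a large file once so later requests can reference it from the cache.
document = genai_client.files.upload(file="path/to/large-document.txt")

cache = genai_client.caches.create(
    model="gemini-2.5-pro",
    config=types.CreateCachedContentConfig(
        display_name="my-cache",
        contents=[document],
        ttl="3600s",  # keep the cache for one hour
    ),
)

print(cache.name)  # e.g. cachedContents/0000aaaa1111bbbb...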
List models
Get a list of available Gemini models:
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

models = client.models.list()
for model in models:
    print(model.id)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});

async function main() {
    const list = await openai.models.list();
    for await (const model of list) {
        console.log(model);
    }
}

main();
REST
curl https://generativelanguage.googleapis.com/v1beta/openai/models \
  -H "Authorization: Bearer $GEMINI_API_KEY"
Retrieve a model
Retrieve a Gemini model:
Python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

model = client.models.retrieve("gemini-2.5-flash")
print(model.id)
JavaScript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "GEMINI_API_KEY",
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});

async function main() {
    const model = await openai.models.retrieve("gemini-2.5-flash");
    console.log(model.id);
}

main();
REST
curl https://generativelanguage.googleapis.com/v1beta/openai/models/gemini-2.5-flash \
  -H "Authorization: Bearer $GEMINI_API_KEY"
Current limitations
Support for the OpenAI libraries is still in beta while we extend feature support.
If you have questions about supported parameters, upcoming features, or run into any issues getting started with Gemini, join our Developer Forum.
What's next
Try our OpenAI Compatibility Colab to work through more detailed examples.