
Generate & edit images using Gemini (aka "nano banana")


You can ask a Gemini model to generate and edit images using both text-only and text-and-image prompts. When you use Firebase AI Logic, you can make this request directly from your app.

With this capability, you can do things like:

  • Iteratively generate images through conversation with natural language, adjusting images while maintaining consistency and context.

  • Generate images with high-quality text rendering, including long strings of text.

  • Generate interleaved text-image output. For example, a blog post with text and images in a single turn. Previously, this required stringing together multiple models.

  • Generate images using Gemini's world knowledge and reasoning capabilities.

You can find a complete list of supported modalities and capabilities (along with example prompts) later on this page.

Jump to code: text-to-image | interleaved text & images | image editing | iterative image editing

See other guides for additional options for working with images: Analyze images, Analyze images on-device, and Generate structured output.

Choosing between Gemini and Imagen models

The Firebase AI Logic SDKs support image generation and editing using either a Gemini model or an Imagen model.

For most use cases, start with Gemini, and then choose Imagen only for specialized tasks where image quality is critical.

Choose Gemini when you want:

  • To use world knowledge and reasoning to generate contextually relevant images.
  • To seamlessly blend text and images or to interleave text and image output.
  • To embed accurate visuals within long text sequences.
  • To edit images conversationally while maintaining context.

Choose Imagen when you want:

  • To prioritize image quality, photorealism, artistic detail, or specific styles (for example, impressionism or anime).
  • To incorporate branding and style, or to generate logos and product designs.
  • To explicitly specify the aspect ratio or format of generated images.

Before you begin

Click your Gemini API provider to view provider-specific content and code on this page.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a GenerativeModel instance.

For testing and iterating on your prompts, we recommend using Google AI Studio.

Models that support this capability

  • gemini-2.5-flash-image (aka "nano banana").

Note that the SDKs also support image generation using Imagen models.
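
If you decide Imagen is the better fit, the call pattern differs from the Gemini samples on this page. The following Kotlin sketch is only an assumption-based illustration of what that looks like; the model name and the imagenModel / generateImages calls follow the separate Imagen guide and should be verified there.

Kotlin

// Assumption-based sketch: generating an image with an Imagen model via the same SDK.
// Verify the exact model name and API surface in the Imagen guide.
val imagenModel = Firebase.ai(backend = GenerativeBackend.googleAI())
    .imagenModel("imagen-3.0-generate-002") // illustrative model name

// Generate images from a text-only prompt
val imagenResponse = imagenModel.generateImages(
    "A watercolor painting of a lighthouse at dawn"
)

// On Android, each returned image can be decoded into a Bitmap
val firstImage: Bitmap? = imagenResponse.images.firstOrNull()?.asBitmap()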

Generate and edit images

You can generate and edit images using a Gemini model.

Generate images (text-only input)

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to generate images by prompting with text.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.

Swift


import FirebaseAI
import UIKit

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate an image
let prompt = "Generate an image of the Eiffel Tower with fireworks in the background."

// To generate an image, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate an image
val prompt = "Generate an image of the Eiffel Tower with fireworks in the background."

// To generate image output, call `generateContent` with the text input
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate an image
Content prompt = new Content.Builder()
    .addText("Generate an image of the Eiffel Tower with fireworks in the background.")
    .build();

// To generate an image, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
  @Override
  public void onSuccess(GenerateContentResponse result) {
    // Iterate over all the parts in the first candidate in the result object
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
      if (part instanceof ImagePart) {
        ImagePart imagePart = (ImagePart) part;
        // The returned image as a bitmap
        Bitmap generatedImageAsBitmap = imagePart.getImage();
        break;
      }
    }
  }

  @Override
  public void onFailure(Throwable t) {
    t.printStackTrace();
  }
}, executor);

Web


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate an image
const prompt = 'Generate an image of the Eiffel Tower with fireworks in the background.';

// To generate an image, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate an image
final prompt = [Content.text('Generate an image of the Eiffel Tower with fireworks in the background.')];

// To generate an image, call `generateContent` with the text input
final response = await model.generateContent(prompt);
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate an image
var prompt = "Generate an image of the Eiffel Tower with fireworks in the background.";

// To generate an image, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}

Generate interleaved images and text

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to generate images interleaved with its text responses. For example, you can generate an image of what each step of a generated recipe might look like alongside that step's instructions, without making separate requests or using different models.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.

Swift


import FirebaseAI
import UIKit

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate interleaved text and images
let prompt = """
Generate an illustrated recipe for a paella.
Create images to go alongside the text as you generate the recipe
"""

// To generate interleaved text and images, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated text and images
guard let candidate = response.candidates.first else {
  fatalError("No candidates in response.")
}
for part in candidate.content.parts {
  switch part {
  case let textPart as TextPart:
    // Do something with the generated text
    let text = textPart.text
  case let inlineDataPart as InlineDataPart:
    // Do something with the generated image
    guard let uiImage = UIImage(data: inlineDataPart.data) else {
      fatalError("Failed to convert data to UIImage.")
    }
  default:
    fatalError("Unsupported part type: \(part)")
  }
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate interleaved text and images
val prompt = """
    Generate an illustrated recipe for a paella.
    Create images to go alongside the text as you generate the recipe
    """.trimIndent()

// To generate interleaved text and images, call `generateContent` with the text input
val responseContent = model.generateContent(prompt).candidates.first().content

// The response will contain image and text parts interleaved
for (part in responseContent.parts) {
    when (part) {
        is ImagePart -> {
            // ImagePart as a bitmap
            val generatedImageAsBitmap: Bitmap? = part.asImageOrNull()
        }
        is TextPart -> {
            // Text content from the TextPart
            val text = part.text
        }
    }
}

Java


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate interleaved text and images
Content prompt = new Content.Builder()
    .addText("Generate an illustrated recipe for a paella.\n" +
        "Create images to go alongside the text as you generate the recipe")
    .build();

// To generate interleaved text and images, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
  @Override
  public void onSuccess(GenerateContentResponse result) {
    Content responseContent = result.getCandidates().get(0).getContent();
    // The response will contain image and text parts interleaved
    for (Part part : responseContent.getParts()) {
      if (part instanceof ImagePart) {
        // ImagePart as a bitmap
        Bitmap generatedImageAsBitmap = ((ImagePart) part).getImage();
      } else if (part instanceof TextPart) {
        // Text content from the TextPart
        String text = ((TextPart) part).getText();
      }
    }
  }

  @Override
  public void onFailure(Throwable t) {
    System.err.println(t);
  }
}, executor);

Web


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate interleaved text and images
const prompt = 'Generate an illustrated recipe for a paella.\n' +
  'Create images to go alongside the text as you generate the recipe';

// To generate interleaved text and images, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated text and images
try {
  const response = result.response;
  if (response.candidates?.[0].content?.parts) {
    for (const part of response.candidates?.[0].content?.parts) {
      if (part.text) {
        // Do something with the text
        console.log(part.text);
      }
      if (part.inlineData) {
        // Do something with the image
        const image = part.inlineData;
        console.log(image.mimeType, image.data);
      }
    }
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate interleaved text and images
final prompt = [Content.text(
  'Generate an illustrated recipe for a paella.\n'
  'Create images to go alongside the text as you generate the recipe',
)];

// To generate interleaved text and images, call `generateContent` with the text input
final response = await model.generateContent(prompt);

// Handle the generated text and images
final parts = response.candidates.firstOrNull?.content.parts;
if (parts != null && parts.isNotEmpty) {
  for (final part in parts) {
    if (part is TextPart) {
      // Do something with the text part
      final text = part.text;
    }
    if (part is InlineDataPart) {
      // Process the image
      final imageBytes = part.bytes;
    }
  }
} else {
  // Handle the case where no content was generated
  print('Error: No content was generated.');
}

Unity


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate interleaved text and images
var prompt = "Generate an illustrated recipe for a paella.\n" +
  "Create images to go alongside the text as you generate the recipe";

// To generate interleaved text and images, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

// Handle the generated text and images
foreach (var part in response.Candidates.First().Content.Parts) {
  if (part is ModelContent.TextPart textPart) {
    if (!string.IsNullOrWhiteSpace(textPart.Text)) {
      // Do something with the text
    }
  } else if (part is ModelContent.InlineDataPart dataPart) {
    if (dataPart.MimeType == "image/png") {
      // Load the image into a Unity Texture2D object
      UnityEngine.Texture2D texture2D = new(2, 2);
      if (texture2D.LoadImage(dataPart.Data.ToArray())) {
        // Do something with the image
      }
    }
  }
}

Edit images (text-and-image input)

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to edit images by prompting with text and one or more images.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.

Swift


import FirebaseAI
import UIKit

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide an image for the model to edit
guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }

// Provide a text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To edit the image, call `generateContent` with the image and text input
let response = try await model.generateContent(image, prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Provide a text prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// To edit the image, call `generateContent` with the prompt (image and text input)
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Provide a text prompt instructing the model to edit the image
Content promptContent = new Content.Builder()
    .addImage(bitmap)
    .addText("Edit this image to make it look like a cartoon")
    .build();

// To edit the image, call `generateContent` with the prompt (image and text input)
ListenableFuture<GenerateContentResponse> response = model.generateContent(promptContent);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
  @Override
  public void onSuccess(GenerateContentResponse result) {
    // Iterate over all the parts in the first candidate in the result object
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
      if (part instanceof ImagePart) {
        ImagePart imagePart = (ImagePart) part;
        Bitmap generatedImageAsBitmap = imagePart.getImage();
        break;
      }
    }
  }

  @Override
  public void onFailure(Throwable t) {
    t.printStackTrace();
  }
}, executor);

Web


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

// Provide a text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// To edit the image, call `generateContent` with the image and text input
const result = await model.generateContent([prompt, imagePart]);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart


import 'dart:io';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide a text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// To edit the image, call `generateContent` with the image and text input
final response = await model.generateContent([
  Content.multi([prompt, imagePart])
]);

// Handle the generated image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide a text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// To edit the image, call `GenerateContentAsync` with the image and text input
var response = await model.GenerateContentAsync(new[] { prompt, image });

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}

Iterate and edit images using multi-turn chat

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

Using multi-turn chat, you can iterate with a Gemini model on the images that it generates or that you supply.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call startChat() and sendMessage() to send new user messages.

Swift


import FirebaseAI
import UIKit

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Initialize the chat
let chat = model.startChat()

guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }

// Provide an initial text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To generate an initial response, send a user message with the image and text prompt
let response = try await chat.sendMessage(image, prompt)

// Inspect the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

// Follow-up requests do not need to specify the image again
let followUpResponse = try await chat.sendMessage("But make it old-school line drawing style")

// Inspect the edited image after the follow-up request
guard let followUpInlineDataPart = followUpResponse.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let followUpUIImage = UIImage(data: followUpInlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Create the initial prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// Initialize the chat
val chat = model.startChat()

// To generate an initial response, send a user message with the image and text prompt
var response = chat.sendMessage(prompt)

// Inspect the returned image
var generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

// Follow-up requests do not need to specify the image again
response = chat.sendMessage("But make it old-school line drawing style")
generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Initialize the chat
ChatFutures chat = model.startChat();

// Create the initial prompt instructing the model to edit the image
Content prompt = new Content.Builder()
    .setRole("user")
    .addImage(bitmap)
    .addText("Edit this image to make it look like a cartoon")
    .build();

// To generate an initial response, send a user message with the image and text prompt
ListenableFuture<GenerateContentResponse> response = chat.sendMessage(prompt);

// Extract the image from the initial response
ListenableFuture<@Nullable Bitmap> initialRequest = Futures.transform(response, result -> {
  for (Part part : result.getCandidates().get(0).getContent().getParts()) {
    if (part instanceof ImagePart) {
      ImagePart imagePart = (ImagePart) part;
      return imagePart.getImage();
    }
  }
  return null;
}, executor);

// Follow-up requests do not need to specify the image again
ListenableFuture<GenerateContentResponse> modelResponseFuture = Futures.transformAsync(
    initialRequest,
    generatedImage -> {
      Content followUpPrompt = new Content.Builder()
          .addText("But make it old-school line drawing style")
          .build();
      return chat.sendMessage(followUpPrompt);
    },
    executor);

// Add a final callback to check the reworked image
Futures.addCallback(modelResponseFuture, new FutureCallback<GenerateContentResponse>() {
  @Override
  public void onSuccess(GenerateContentResponse result) {
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
      if (part instanceof ImagePart) {
        ImagePart imagePart = (ImagePart) part;
        Bitmap generatedImageAsBitmap = imagePart.getImage();
        break;
      }
    }
  }

  @Override
  public void onFailure(Throwable t) {
    t.printStackTrace();
  }
}, executor);

Web


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// Provide an initial text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

// Initialize the chat
const chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
const result = await chat.sendMessage([prompt, imagePart]);

// Request and inspect the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    // Inspect the generated image
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

// Follow-up requests do not need to specify the image again
const followUpResult = await chat.sendMessage("But make it old-school line drawing style");

// Request and inspect the returned image
try {
  const followUpInlineDataParts = followUpResult.response.inlineDataParts();
  if (followUpInlineDataParts?.[0]) {
    // Inspect the generated image
    const followUpImage = followUpInlineDataParts[0].inlineData;
    console.log(followUpImage.mimeType, followUpImage.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart


import 'dart:io';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide an initial text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// Initialize the chat
final chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
final response = await chat.sendMessage([
  Content.multi([prompt, imagePart])
]);

// Inspect the returned image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

// Follow-up requests do not need to specify the image again
final followUpResponse = await chat.sendMessage([
  Content.text("But make it old-school line drawing style")
]);

// Inspect the returned image
if (followUpResponse.inlineDataParts.isNotEmpty) {
  final followUpImageBytes = followUpResponse.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide an initial text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// Initialize the chat
var chat = model.StartChat();

// To generate an initial response, send a user message with the image and text prompt
var response = await chat.SendMessageAsync(new[] { prompt, image });

// Inspect the returned image
var imageParts = response.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");

// Load the image into a Unity Texture2D object
UnityEngine.Texture2D texture2D = new(2, 2);
if (texture2D.LoadImage(imageParts.First().Data.ToArray())) {
  // Do something with the image
}

// Follow-up requests do not need to specify the image again
var followUpResponse = await chat.SendMessageAsync("But make it old-school line drawing style");

// Inspect the returned image
var followUpImageParts = followUpResponse.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");

// Load the image into a Unity Texture2D object
UnityEngine.Texture2D followUpTexture2D = new(2, 2);
if (followUpTexture2D.LoadImage(followUpImageParts.First().Data.ToArray())) {
  // Do something with the image
}



Supported features, limitations, and best practices

Supported modalities and capabilities

The following modalities and capabilities are supported for image output from a Gemini model. Each capability includes an example prompt and corresponds to one of the code samples earlier on this page.

  • Text → Image(s) (text-only to image)

    • Generate an image of the Eiffel Tower with fireworks in the background.
  • Text → Image(s) (text rendering within an image)

    • Generate a cinematic photo of a large building with this giant text projection mapped on the front of the building.
  • Text → Image(s) & Text (interleaved)

    • Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe.

    • Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image.

  • Image(s) & Text → Image(s) & Text (interleaved)

    • [image of a furnished room] + What other color sofas would work in my space? Can you update the image?
  • Image editing (text-and-image to image); a sketch using two input images appears after this list

    • [image of scones] + Edit this image to make it look like a cartoon

    • [image of a cat] + [image of a pillow] + Create a cross stitch of my cat on this pillow.

  • Multi-turn image editing (chat)

    • [image of a blue car] + Turn this car into a convertible, followed by Now change the color to yellow.
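
The code samples earlier on this page show editing with a single input image; the following Kotlin sketch shows one way to pass two images in a single request, as in the cross-stitch prompt above. It assumes the same `model` configuration as the Kotlin samples above, and the resource names (R.drawable.cat, R.drawable.pillow) are illustrative.

Kotlin

// Minimal sketch: editing with two input images, assuming the same `model`
// configuration as the Kotlin samples above. Resource names are illustrative.
val catBitmap = BitmapFactory.decodeResource(context.resources, R.drawable.cat)
val pillowBitmap = BitmapFactory.decodeResource(context.resources, R.drawable.pillow)

// Combine both images and the instruction in a single prompt
val prompt = content {
    image(catBitmap)
    image(pillowBitmap)
    text("Create a cross stitch of my cat on this pillow.")
}

// The response is handled the same way as for single-image editing
val generatedImageAsBitmap = model.generateContent(prompt)
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image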

Limitations and best practices

The following are limitations and best practices for image output from a Gemini model.

  • Image-generating Gemini models support the following:

    • Generating PNG images with a maximum dimension of 1024 px.
    • Generating and editing images of people.
    • Using safety filters that provide a flexible and less restrictive user experience.
  • Image-generating Gemini models do not support the following:

    • Including audio or video inputs.
    • Generating only images.
      The models will always return both text and images, and you must include responseModalities: ["TEXT", "IMAGE"] in your model configuration.
  • For best performance, use the following languages: en, es-mx, ja-jp, zh-cn, hi-in.

  • Image generation may not always trigger. Here are some known issues:

    • The model may output text only.
      Try asking for image outputs explicitly (for example, "generate an image", "provide images as you go along", "update the image"). A minimal retry sketch appears after this list.

    • The model may stop generating partway through.
      Try again or try a different prompt.

    • The model may generate text as an image.
      Try asking for text outputs explicitly. For example, "generate narrative text along with illustrations."

  • When generating text for an image, Gemini works best if you first generate the text and then ask for an image with the text.
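
Because image generation may not always trigger, it can help to check the response for image data and, if none is present, follow up in the same chat with an explicit request for an image. The following Kotlin sketch illustrates this pattern; the helper name requestImageWithRetry and the retry prompt wording are illustrative rather than part of the SDK, and it assumes a GenerativeModel configured with TEXT and IMAGE response modalities as in the samples above.

Kotlin

import android.graphics.Bitmap

// Minimal sketch: retry once when the model returns text only.
// `requestImageWithRetry` and the retry prompt are illustrative, not SDK APIs.
// Assumes `GenerativeModel` and `ImagePart` from the Firebase AI Logic SDK,
// with the model configured for TEXT and IMAGE response modalities.
suspend fun requestImageWithRetry(model: GenerativeModel, prompt: String): Bitmap? {
    val chat = model.startChat()

    // First attempt: send the original prompt
    val first = chat.sendMessage(prompt)
    val firstImage = first.candidates.first().content.parts
        .filterIsInstance<ImagePart>().firstOrNull()?.image
    if (firstImage != null) return firstImage

    // The model responded with text only; ask explicitly for an image in the same chat
    val retry = chat.sendMessage("Generate an image for the previous request.")
    return retry.candidates.first().content.parts
        .filterIsInstance<ImagePart>().firstOrNull()?.image
}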
