
Bidirectional streaming using the Gemini Live API


The Gemini Live API enables low-latency bidirectional text and voice interactions with Gemini. Using the Live API, you can provide end users with the experience of natural, human-like voice conversations, with the ability to interrupt the model's responses using text or voice commands. The model can process text and audio input (video coming soon!), and it can provide text and audio output.

You can prototype with prompts and the Live API in Google AI Studio or Vertex AI Studio.

The Live API is a stateful API that creates a WebSocket connection to establish a session between the client and the Gemini server. For details, see the Live API reference documentation (Gemini Developer API | Vertex AI Gemini API).
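
At a glance, each session follows a connect, send, and receive lifecycle. The following Kotlin sketch condenses that flow from the full samples later on this page (imports omitted, matching those samples); treat the provider-specific samples below as the reference.

// Create the Live model with the same configuration used in the samples below
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    generationConfig = liveGenerationConfig { responseModality = ResponseModality.TEXT }
)

// `connect()` opens the WebSocket-backed session with the server
val session = model.connect()

// Stream input over the open session...
session.send("Hello!")

// ...and collect the streamed output until the model's turn completes
session.receive().collect {
    print(it.text)
    if (it.turnComplete) session.stopReceiving()
}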

Before you begin

Click your Gemini API provider to view provider-specific content and code on this page.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a LiveModel instance.

Models that support this capability

The models that support the Live API depend on your chosen Gemini API provider.

  • Gemini Developer API

    • gemini-live-2.5-flash (private GA*)
    • gemini-live-2.5-flash-preview
    • gemini-2.0-flash-live-001
    • gemini-2.0-flash-live-preview-04-09
  • Vertex AI Gemini API

    • gemini-live-2.5-flash (private GA*)
    • gemini-2.0-flash-live-preview-04-09 (available only in us-central1)

Note that in the 2.5 model names for the Live API, the live segment immediately follows the gemini segment (for example, gemini-live-2.5-flash).

* Reach out to your Google Cloud account team representative to request access.

Use the standard features of the Live API

This section describes how to use the standard features of the Live API, specifically to stream various types of inputs and outputs:

  • Generate streamed text from streamed text input
  • Generate streamed audio from streamed audio input
  • Generate streamed text from streamed audio input
  • Generate streamed audio from streamed text input

Generate streamed text from streamed text input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can send streamed text input and receive streamed text output. Make sure to create a LiveModel instance and set the response modality to Text.

Swift


import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: LiveGenerationConfig(
    responseModalities: [.text]
  )
)

do {
  let session = try await model.connect()

  // Provide a text prompt
  let text = "tell a short story"
  await session.sendTextRealtime(text)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? TextPart {
          outputText += part.text
        }
      }
      // Optional: close the session if you don't need to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }

  // Output received from the server.
  print(outputText)
} catch {
  fatalError(error.localizedDescription)
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT
    }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"
session.send(text)

var outputText = ""
session.receive().collect {
    if (it.turnComplete) {
        // Optional: stop receiving if you don't need to send more requests.
        session.stopReceiving()
    }
    outputText = outputText + it.text
}

// Output received from the server.
println(outputText)

Java


ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture = model.connect();

class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }

    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle the response from the server.
        System.out.println(liveContentResponse.getText());
    }

    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }

    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);

        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);

        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: {
    responseModalities: [ResponseModality.TEXT],
  },
});

const session = await model.connect();

// Provide a text prompt
const prompt = "tell a short story";
session.send(prompt);

// Collect text from model's turn
let text = "";

const messages = session.receive();
for await (const message of messages) {
  switch (message.type) {
    case "serverContent":
      if (message.turnComplete) {
        console.log(text);
      } else {
        const parts = message.modelTurn?.parts;
        if (parts) {
          text += parts.map((part) => part.text).join("");
        }
      }
      break;
    case "toolCall":
      // Ignore
    case "toolCallCancellation":
      // Ignore
  }
}

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

late LiveModelSession _session;

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with text
  liveGenerationConfig: LiveGenerationConfig(responseModalities: [ResponseModalities.text]),
);

_session = await model.connect();

// Provide a text prompt
final prompt = Content.text('tell a short story');
await _session.send(input: prompt, turnComplete: true);

// In a separate thread, receive the response
await for (final message in _session.receive()) {
  // Process the received message
}

Unity


using Firebase;
using Firebase.AI;

async Task SendTextReceiveText() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("tell a short story");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }
}

Generate streamed audio from streamed audio input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can send streamed audio input and receive streamed audio output. Make sure to create a LiveModel instance and set the response modality to Audio.

Learn how to configure and customize the response voice (later on this page).

Swift


import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio]
  )
)

do {
  let session = try await model.connect()

  // Load the audio file, or tap a microphone
  guard let audioFile = NSDataAsset(name: "audio.pcm") else {
    fatalError("Failed to load audio file")
  }

  // Provide the audio data
  await session.sendAudioRealtime(audioFile.data)

  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? InlineDataPart, part.mimeType.starts(with: "audio/pcm") {
          // Handle 16-bit PCM audio data at 24kHz
          playAudio(part.data)
        }
      }
      // Optional: close the session if you don't need to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }
} catch {
  fatalError(error.localizedDescription)
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
    }
)

val session = model.connect()

// Using `startAudioConversation()` is the recommended way.
// However, you can create your own recorder and handle the stream yourself.
session.startAudioConversation()

Java


ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with audio
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture = model.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        session.startAudioConversation();
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
  },
});

const session = await model.connect();

// Start the audio conversation
const audioConversationController = await startAudioConversation(session);

// ... Later, to stop the audio conversation
// await audioConversationController.stop()

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with audio
  liveGenerationConfig: LiveGenerationConfig(responseModalities: [ResponseModalities.audio]),
);

_session = await model.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();
// Map the Uint8List stream to InlineDataPart stream
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});
await _session.startMediaStream(mediaChunkStream);

// In a separate thread, receive the audio response from the model
await for (final message in _session.receive()) {
  // Process the received message
}

Unity


using Firebase;
using Firebase.AI;

async Task SendAudioReceiveAudio() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Start receiving the response
  await ReceiveAudio(session);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
      recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);
    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession liveSession) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
      sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in liveSession.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock (audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Generate streamed text from streamed audio input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can send streamed audio input and receive streamed text output. Make sure to create a LiveModel instance and set the response modality to Text.

Swift


import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: LiveGenerationConfig(
    responseModalities: [.text]
  )
)

do {
  let session = try await model.connect()

  // Load the audio file, or tap a microphone
  guard let audioFile = NSDataAsset(name: "audio.pcm") else {
    fatalError("Failed to load audio file")
  }

  // Provide the audio data
  await session.sendAudioRealtime(audioFile.data)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? TextPart {
          outputText += part.text
        }
      }
      // Optional: close the session if you don't need to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }

  // Output received from the server.
  print(outputText)
} catch {
  fatalError(error.localizedDescription)
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT
    }
)

val session = model.connect()

// Provide the audio data
val audioContent = content("user") { audioData }
session.send(audioContent)

var outputText = ""
session.receive().collect {
    if (it.status == Status.TURN_COMPLETE) {
        // Optional: stop receiving if you don't need to send more requests.
        session.stopReceiving()
    }
    outputText = outputText + it.text
}

// Output received from the server.
println(outputText)

Java


ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture = model.connect();

class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }

    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle the response from the server.
        System.out.println(liveContentResponse.getText());
    }

    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }

    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);

        // Send the audio data
        session.send(new Content.Builder().addInlineData(audioData, "audio/pcm").build());

        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: {
    responseModalities: [ResponseModality.TEXT],
  },
});

const session = await model.connect();

// TODO(developer): Collect audio data (16-bit 16kHz PCM)
// const audioData = ...

// Send audio
const audioPart = {
  inlineData: { data: audioData, mimeType: "audio/pcm" },
};
session.send([audioPart]);

// Collect text from model's turn
let text = "";

const messages = session.receive();
for await (const message of messages) {
  switch (message.type) {
    case "serverContent":
      if (message.turnComplete) {
        console.log(text);
      } else {
        const parts = message.modelTurn?.parts;
        if (parts) {
          text += parts.map((part) => part.text).join("");
        }
      }
      break;
    case "toolCall":
      // Ignore
    case "toolCallCancellation":
      // Ignore
  }
}

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';
import 'dart:async';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

Future<Stream<String>> audioToText() async {
  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  final model = FirebaseAI.googleAI().liveGenerativeModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    // Configure the model to respond with text
    liveGenerationConfig: LiveGenerationConfig(responseModalities: [ResponseModalities.text]),
  );

  _session = await model.connect();

  final audioRecordStream = _audioRecorder.startRecordingStream();
  // Map the Uint8List stream to InlineDataPart stream
  final mediaChunkStream = audioRecordStream.map((data) {
    return InlineDataPart('audio/pcm', data);
  });
  await _session.startMediaStream(mediaChunkStream);

  final responseStream = _session.receive();
  return responseStream.asyncMap((response) async {
    if (response.parts.isNotEmpty && response.parts.first.text != null) {
      return response.parts.first.text!;
    } else {
      throw Exception('Text response not found.');
    }
  });
}

Future<void> main() async {
  try {
    final textStream = await audioToText();
    await for (final text in textStream) {
      print('Received text: $text');
      // Handle the text response
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity


using Firebase;
using Firebase.AI;

async Task SendAudioReceiveText() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }

  StopCoroutine(recordingCoroutine);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
      recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);
    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Generate streamed audio from streamed text input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can send streamed text input and receive streamed audio output. Make sure to create a LiveModel instance and set the response modality to Audio.

Learn how to configure and customize the response voice (later on this page).

Swift


import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio]
  )
)

do {
  let session = try await model.connect()

  // Provide a text prompt
  let text = "tell a short story"
  await session.sendTextRealtime(text)

  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? InlineDataPart, part.mimeType.starts(with: "audio/pcm") {
          // Handle 16-bit PCM audio data at 24kHz
          playAudio(part.data)
        }
      }
      // Optional: close the session if you don't need to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }
} catch {
  fatalError(error.localizedDescription)
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
    }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"
session.send(text)

session.receive().collect {
    if (it.turnComplete) {
        // Optional: stop receiving if you don't need to send more requests.
        session.stopReceiving()
    }
    // Handle 16-bit PCM audio data at 24kHz
    playAudio(it.data)
}

Java


ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with audio
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture = model.connect();

class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }

    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle 16-bit PCM audio data at 24kHz
        liveContentResponse.getData();
    }

    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }

    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);

        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);

        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
  },
});

const session = await model.connect();

// Provide a text prompt
const prompt = "tell a short story";
session.send(prompt);

// Handle the model's audio output
const messages = session.receive();
for await (const message of messages) {
  switch (message.type) {
    case "serverContent":
      if (message.turnComplete) {
        // TODO(developer): Handle turn completion
      } else if (message.interrupted) {
        // TODO(developer): Handle the interruption
        break;
      } else if (message.modelTurn) {
        const parts = message.modelTurn?.parts;
        parts?.forEach((part) => {
          if (part.inlineData) {
            // TODO(developer): Play the audio chunk
          }
        });
      }
      break;
    case "toolCall":
      // Ignore
    case "toolCallCancellation":
      // Ignore
  }
}

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'dart:async';
import 'dart:typed_data';

late LiveModelSession _session;

Future<Stream<Uint8List>> textToAudio(String textPrompt) async {
  WidgetsFlutterBinding.ensureInitialized();
  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  final model = FirebaseAI.googleAI().liveGenerativeModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    // Configure the model to respond with audio
    liveGenerationConfig: LiveGenerationConfig(responseModalities: [ResponseModalities.audio]),
  );

  _session = await model.connect();

  final prompt = Content.text(textPrompt);
  await _session.send(input: prompt);

  return _session.receive().asyncMap((response) async {
    if (response is LiveServerContent && response.modelTurn?.parts != null) {
      for (final part in response.modelTurn!.parts) {
        if (part is InlineDataPart) {
          return part.bytes;
        }
      }
    }
    throw Exception('Audio data not found');
  });
}

Future<void> main() async {
  try {
    final audioStream = await textToAudio('Convert this text to audio.');
    await for (final audioData in audioStream) {
      // Process the audio data (e.g., play it using an audio player package)
      print('Received audio data: ${audioData.length} bytes');
      // Example using flutter_sound (replace with your chosen package):
      // await _flutterSoundPlayer.startPlayer(fromDataBuffer: audioData);
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity


using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("Convert this text to audio.");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Start receiving the response
  await ReceiveAudio(session);
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession session) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
      sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock (audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}



Create more engaging and interactive experiences

This section describes how to create and manage more engaging or interactive features of the Live API.

Change the response voice

The Live API uses Chirp 3 to support synthesized speech responses. When using Firebase AI Logic, the model can respond with audio in a variety of HD voices and languages. For a full list and demos of what each voice sounds like, see Chirp 3: HD voices.

To specify a voice, set the voice name within the speechConfig object as part of the model configuration. If you don't specify a voice, the default is Puck.

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

Swift


import FirebaseAI

// ...

let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to use a specific voice for its audio response
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    speech: SpeechConfig(voiceName: "VOICE_NAME")
  )
)

// ...

Kotlin


// ...

val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voice("VOICE_NAME"))
    }
)

// ...

Java


// ...

LiveGenerativeModel model = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to use a specific voice for its audio response
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .setSpeechConfig(new SpeechConfig(new Voice("VOICE_NAME")))
                .build()
);

// ...

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to use a specific voice for its audio response
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    speechConfig: {
      voiceConfig: {
        prebuiltVoiceConfig: { voiceName: "VOICE_NAME" },
      },
    },
  },
});

Dart


// ...

final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to use a specific voice for its audio response
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    speechConfig: SpeechConfig(voiceName: 'VOICE_NAME'),
  ),
);

// ...

Unity


var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  liveGenerationConfig: new LiveGenerationConfig(
    responseModalities: new[] { ResponseModality.Audio },
    speechConfig: SpeechConfig.UsePrebuiltVoice("VOICE_NAME"))
);

For the best results when prompting and requiring the model to respond in a non-English language, include the following as part of your system instructions:

RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.
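
For example, in Kotlin you could bake that instruction into the model's system instructions when creating the Live model. This is a minimal sketch: it assumes the liveModel builder accepts a systemInstruction parameter (as generativeModel does), and SPANISH stands in for your target language.

val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voice("VOICE_NAME"))
    },
    // Assumption: `systemInstruction` is accepted here, as it is for `generativeModel`.
    systemInstruction = content { text("RESPOND IN SPANISH. YOU MUST RESPOND UNMISTAKABLY IN SPANISH.") }
)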

Maintain context across sessions and requests

You can use a chat structure to maintain context across sessions and requests. Note that this only works for text input and text output.

This approach works best for short contexts; you can send turn-by-turn interactions to represent the exact sequence of events. For longer contexts, we recommend providing a single summary message to free up the context window for subsequent interactions.
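
As an illustration, the following Kotlin sketch seeds a new session with prior context before continuing. It reuses only the send and receive calls from the Kotlin samples above; buildContext is a hypothetical helper you would implement (for longer histories, have it return a summary instead of the full transcript).

// Hypothetical helper: represent earlier turns as a single text message.
// For longer histories, return a summary instead of the full transcript.
fun buildContext(history: List<Pair<String, String>>): String =
    "Context from our earlier conversation:\n" +
        history.joinToString("\n") { (userTurn, modelTurn) -> "User: $userTurn\nModel: $modelTurn" }

suspend fun resumeConversation(history: List<Pair<String, String>>) {
    val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
        modelName = "gemini-2.0-flash-live-preview-04-09",
        generationConfig = liveGenerationConfig { responseModality = ResponseModality.TEXT }
    )
    val session = model.connect()

    // Seed the new session with the prior context, then continue the conversation.
    session.send(buildContext(history) + "\n\nContinue the story from where we left off.")

    session.receive().collect {
        print(it.text)
        if (it.turnComplete) session.stopReceiving()
    }
}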

Handle interruptions

Firebase AI Logic does not yet support handling interruptions. Check back soon!

Use function calling (tools)

You can define tools, like available functions, to use with the Live API just like you can with the standard content generation methods. This section describes some nuances when using the Live API with function calling. For a complete description and examples for function calling, see the function calling guide.

From a single prompt, the model can generate multiple function calls and the code necessary to chain their outputs. This code executes in a sandbox environment, generating subsequent BidiGenerateContentToolCall messages. The execution pauses until the results of each function call are available, which ensures sequential processing.

Additionally, using the Live API with function calling is particularly powerful because the model can request follow-up or clarifying information from the user. For example, if the model doesn't have enough information to provide a parameter value to a function it wants to call, then the model can ask the user to provide more or clarifying information.

The client should respond with BidiGenerateContentToolResponse.
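
To make the shape of this concrete, here is a hedged Kotlin sketch. The FunctionDeclaration, Schema, and Tool pieces follow the function calling guide; passing tools to liveModel and handling calls through a functionCallHandler on startAudioConversation are assumptions about the Live API surface, so confirm the exact signatures in the function calling guide and the SDK reference.

// Declare a function the model is allowed to call (see the function calling guide).
val fetchWeatherTool = FunctionDeclaration(
    "fetchWeather",
    "Get the current weather for a city.",
    mapOf("city" to Schema.string("The city to look up."))
)

// Assumption: `liveModel` accepts a `tools` parameter, like `generativeModel` does.
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    generationConfig = liveGenerationConfig { responseModality = ResponseModality.AUDIO },
    tools = listOf(Tool.functionDeclarations(listOf(fetchWeatherTool)))
)

val session = model.connect()

// Assumption: the audio conversation helper surfaces each tool call to a handler,
// and the returned part is sent back to the model as the tool response
// (BidiGenerateContentToolResponse).
session.startAudioConversation(functionCallHandler = { functionCall ->
    // Run your own implementation, then return the result to the model.
    val result = JsonObject(mapOf("temperatureCelsius" to JsonPrimitive(21)))
    FunctionResponsePart(functionCall.name, result)
})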



Limitations and requirements

Keep in mind the following limitations and requirements of the Live API.

Transcription

Firebase AI Logic does not yet support transcriptions. Check back soon!

Languages

Audio formats

The Live API supports the following audio formats:

  • Input audio format: Raw 16-bit PCM audio at 16 kHz, little-endian
  • Output audio format: Raw 16-bit PCM audio at 24 kHz, little-endian
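
If you assemble raw PCM yourself (rather than relying on an SDK helper such as startAudioConversation or Unity's SendAudioAsync), floating-point samples need to be converted to this 16-bit little-endian layout first. The helper below is a hypothetical Kotlin utility, not part of the SDK, showing that conversion.

import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper: convert float PCM samples in [-1.0, 1.0] into
// raw 16-bit little-endian PCM bytes, the input layout the Live API expects.
fun floatsToPcm16LittleEndian(samples: FloatArray): ByteArray {
    val buffer = ByteBuffer.allocate(samples.size * 2).order(ByteOrder.LITTLE_ENDIAN)
    for (sample in samples) {
        // Clamp to the valid range, then scale to the signed 16-bit range.
        val clamped = sample.coerceIn(-1.0f, 1.0f)
        buffer.putShort((clamped * Short.MAX_VALUE).toInt().toShort())
    }
    return buffer.array()
}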

Rate limits

The Live API has rate limits for both concurrent sessions per Firebase project and tokens per minute (TPM).

  • Gemini Developer API:

  • Vertex AI Gemini API:

    • 5,000 concurrent sessions per Firebase project
    • 4M tokens per minute

Session length

The default length for a session is 10 minutes. When the session duration exceeds the limit, the connection is terminated.

The model is also limited by the context size. Sending large chunks of input may result in earlier session termination.

Voice activity detection (VAD)

The model automatically performs voice activity detection (VAD) on a continuous audio input stream. VAD is enabled by default.

Token counting

You cannot use the CountTokens API with the Live API.

