Use the Count Tokens API
This page shows you how to get the token count for a prompt by using the countTokens API.
Supported models
The following multimodal models support getting an estimate of the prompt token count:
- Gemini 2.5 Flash (Preview)
- Gemini 2.5 Flash-Lite (Preview)
- Gemini 2.5 Flash Image
- Gemini 2.5 Flash-Lite
- Gemini 2.0 Flash with image generation (Preview)
- Gemini 2.5 Pro
- Gemini 2.5 Flash
- Gemini 2.0 Flash
- Gemini 2.0 Flash-Lite
To learn more about model versions, see Gemini model versions and lifecycle.
Get the token count for a prompt
You can get the token count estimate for a prompt by using the Vertex AI API.
Console
To get the token count for a prompt by using Vertex AI Studio in the Google Cloud console, perform the following steps:
- In the Vertex AI section of the Google Cloud console, go to the Vertex AI Studio page.
- Click either Open Freeform or Open Chat.
- The number of tokens is calculated and displayed as you type in the Prompt pane. It includes the number of tokens in any input files.
- To see more details, click <count> tokens to open the Prompt tokenizer.
- To view the tokens in the text prompt that are highlighted with different colors marking the boundary of each token ID, click Token ID to text. Media tokens aren't supported.
- To view the token IDs, click Token ID.
To close the tokenizer tool pane, click X, or click outside of the pane.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))
response = client.models.count_tokens(
    model="gemini-2.5-flash",
    contents="What's the highest mountain in Africa?",
)
print(response)
# Example output:
# total_tokens=9
# cached_content_token_count=None
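One way to use the count is to check a prompt against a token budget before sending it. The following is a minimal sketch, not part of the SDK: `fits_token_budget` and `fake_counter` are hypothetical helpers, and the budget is an arbitrary illustration. The counting function is passed in so that the check can be tried without network access. The `None` check assumes the count may be absent from a response.

```python
from typing import Callable, Optional


def fits_token_budget(
    count_tokens: Callable[[str], Optional[int]],
    prompt: str,
    max_tokens: int,
) -> bool:
    """Return True when the counted prompt tokens do not exceed max_tokens."""
    total = count_tokens(prompt)
    if total is None:
        # Assumption: the count may be missing from a response, so fail loudly.
        raise ValueError("Token count was not returned")
    return total <= max_tokens


def fake_counter(prompt: str) -> int:
    """Hypothetical stand-in for a real count_tokens call; NOT the real tokenizer."""
    return len(prompt.split())


if __name__ == "__main__":
    prompt = "What's the highest mountain in Africa?"
    print(fits_token_budget(fake_counter, prompt, max_tokens=10))
```

With the client from the sample above, the counting function could be `lambda p: client.models.count_tokens(model="gemini-2.5-flash", contents=p).total_tokens`.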
Go
Learn how to install or update the Gen AI SDK for Go.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"

	genai "google.golang.org/genai"
)

// countWithTxt shows how to count tokens with text input.
func countWithTxt(w io.Writer) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	modelName := "gemini-2.5-flash"
	contents := []*genai.Content{
		{Parts: []*genai.Part{
			{Text: "What's the highest mountain in Africa?"},
		}},
	}

	resp, err := client.Models.CountTokens(ctx, modelName, contents, nil)
	if err != nil {
		return fmt.Errorf("failed to count tokens: %w", err)
	}

	fmt.Fprintf(w, "Total: %d\nCached: %d\n", resp.TotalTokens, resp.CachedContentTokenCount)

	// Example response:
	// Total: 9
	// Cached: 0

	return nil
}
Node.js
Install
npm install @google/genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function countTokens(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await client.models.countTokens({
    model: 'gemini-2.5-flash',
    contents: 'What is the highest mountain in Africa?',
  });

  console.log(response);

  return response.totalTokens;
}
Java
Learn how to install or update the Gen AI SDK for Java.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import com.google.genai.Client;
import com.google.genai.types.CountTokensResponse;
import com.google.genai.types.HttpOptions;
import java.util.Optional;

public class CountTokensWithText {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    countTokens(modelId);
  }

  // Counts tokens with text input
  public static Optional<Integer> countTokens(String modelId) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      CountTokensResponse response =
          client.models.countTokens(modelId, "What's the highest mountain in Africa?", null);

      System.out.print(response);
      // Example response:
      // CountTokensResponse{totalTokens=Optional[9], cachedContentTokenCount=Optional.empty}
      return response.totalTokens();
    }
  }
}
REST
To get the token count for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: The region to process the request. Available options include the following (partial list):
  - us-central1
  - us-west4
  - northamerica-northeast1
  - us-east4
  - us-west1
  - asia-northeast3
  - asia-southeast1
  - asia-northeast1
- PROJECT_ID: Your project ID.
- MODEL_ID: The model ID of the multimodal model that you want to use.
- ROLE: The role in a conversation associated with the content. Specifying a role is required even in single-turn use cases. Acceptable values include the following:
  - USER: Specifies content that's sent by you.
- TEXT: The text instructions to include in the prompt.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens
Request JSON body:
{ "contents": [{ "role": "ROLE", "parts": [{ "text": "TEXT" }] }] }
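The URL and request body above can also be assembled programmatically from the placeholder values. The following is a minimal sketch using only the Python standard library. `build_count_tokens_request` is a hypothetical helper name, and the argument values in the usage example are illustrative. The sketch builds the request but does not send it or handle authentication.

```python
import json
from typing import Any, Dict, Tuple


def build_count_tokens_request(
    location: str, project_id: str, model_id: str, role: str, text: str
) -> Tuple[str, Dict[str, Any]]:
    """Fill the LOCATION, PROJECT_ID, MODEL_ID, ROLE, and TEXT placeholders."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model_id}:countTokens"
    )
    # Same shape as the request JSON body shown above.
    body = {"contents": [{"role": role, "parts": [{"text": text}]}]}
    return url, body


if __name__ == "__main__":
    url, body = build_count_tokens_request(
        "us-central1",  # illustrative values, replace with your own
        "my-project",
        "gemini-2.5-flash",
        "USER",
        "What's the highest mountain in Africa?",
    )
    print(url)
    print(json.dumps(body, indent=2))
```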
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
Response
{ "totalTokens": 31, "totalBillableCharacters": 96, "promptTokensDetails": [ { "modality": "TEXT", "tokenCount": 31 } ] }
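A response of this shape can be reduced to the fields that matter for budgeting. The following is a minimal sketch using only the Python standard library. `summarize_count_response` is a hypothetical helper, and it assumes the field names match the sample response above. Fields that are missing are returned as `None` or as an empty mapping.

```python
import json
from typing import Any, Dict


def summarize_count_response(raw: str) -> Dict[str, Any]:
    """Extract the total and per-modality token counts from a countTokens response."""
    data = json.loads(raw)
    by_modality = {
        detail["modality"]: detail["tokenCount"]
        for detail in data.get("promptTokensDetails", [])
    }
    return {
        "total_tokens": data.get("totalTokens"),
        "billable_characters": data.get("totalBillableCharacters"),
        "by_modality": by_modality,
    }


if __name__ == "__main__":
    sample = (
        '{ "totalTokens": 31, "totalBillableCharacters": 96, '
        '"promptTokensDetails": [ { "modality": "TEXT", "tokenCount": 31 } ] }'
    )
    print(summarize_count_response(sample))
```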
Example for text with image or video:
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import HttpOptions, Part

client = genai.Client(http_options=HttpOptions(api_version="v1"))

contents = [
    Part.from_uri(
        file_uri="gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
        mime_type="video/mp4",
    ),
    "Provide a description of the video.",
]

response = client.models.count_tokens(
    model="gemini-2.5-flash",
    contents=contents,
)
print(response)
# Example output:
# total_tokens=16252 cached_content_token_count=None
Go
Learn how to install or update the Gen AI SDK for Go.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"

	genai "google.golang.org/genai"
)

// countWithTxtAndVid shows how to count tokens with text and video inputs.
func countWithTxtAndVid(w io.Writer) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	modelName := "gemini-2.5-flash"
	contents := []*genai.Content{
		{Parts: []*genai.Part{
			{Text: "Provide a description of the video."},
			{FileData: &genai.FileData{
				FileURI:  "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
				MIMEType: "video/mp4",
			}},
		},
			Role: "user"},
	}

	resp, err := client.Models.CountTokens(ctx, modelName, contents, nil)
	if err != nil {
		return fmt.Errorf("failed to count tokens: %w", err)
	}

	fmt.Fprintf(w, "Total: %d\nCached: %d\n", resp.TotalTokens, resp.CachedContentTokenCount)

	// Example response:
	// Total: 16252
	// Cached: 0

	return nil
}
Node.js
Install
npm install @google/genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function countTokens(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const video = {
    fileData: {
      fileUri: 'gs://cloud-samples-data/generative-ai/video/pixel8.mp4',
      mimeType: 'video/mp4',
    },
  };

  const response = await client.models.countTokens({
    model: 'gemini-2.5-flash',
    contents: [video, 'Provide a description of the video.'],
  });

  console.log(response);

  return response.totalTokens;
}
Java
Learn how to install or update the Gen AI SDK for Java.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import com.google.genai.Client;
import com.google.genai.types.Content;
import com.google.genai.types.CountTokensResponse;
import com.google.genai.types.HttpOptions;
import com.google.genai.types.Part;
import java.util.List;
import java.util.Optional;

public class CountTokensWithTextAndVideo {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    countTokens(modelId);
  }

  // Counts tokens with text and video inputs
  public static Optional<Integer> countTokens(String modelId) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      Content content =
          Content.fromParts(
              Part.fromText("Provide a description of this video"),
              Part.fromUri("gs://cloud-samples-data/generative-ai/video/pixel8.mp4", "video/mp4"));

      CountTokensResponse response = client.models.countTokens(modelId, List.of(content), null);

      System.out.print(response);
      // Example response:
      // CountTokensResponse{totalTokens=Optional[16707], cachedContentTokenCount=Optional.empty}
      return response.totalTokens();
    }
  }
}
REST
To get the token count for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
MODEL_ID="gemini-2.5-flash"
PROJECT_ID="my-project"
TEXT="Provide a summary with about two sentences for the following article."
REGION="us-central1"

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:countTokens -d \
  $'{
    "contents": [{
      "role": "user",
      "parts": [
        {
          "file_data": {
            "file_uri": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
            "mime_type": "video/mp4"
          }
        },
        {
          "text": "'"$TEXT"'"
        }]
    }]
  }'
Pricing and quota
There is no charge for using the CountTokens API. The maximum quota for the CountTokens API is 3000 requests per minute.
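A client that issues many count requests may want to stay under the 3000 requests per minute figure on its own side. The following is a minimal sketch of a sliding-window limiter using only the Python standard library. `RequestsPerMinuteLimiter` is a hypothetical class, not part of any SDK, and it does not reflect how the service itself enforces quota.

```python
import time
from collections import deque
from typing import Callable, Deque


class RequestsPerMinuteLimiter:
    """Allow at most max_per_minute requests in any rolling 60-second window."""

    def __init__(
        self,
        max_per_minute: int = 3000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_per_minute = max_per_minute
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a request and return True if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self._sent and now - self._sent[0] >= 60.0:
            self._sent.popleft()
        if len(self._sent) < self.max_per_minute:
            self._sent.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = RequestsPerMinuteLimiter()
    if limiter.try_acquire():
        print("OK to send a countTokens request")
    else:
        print("Window is full; wait before retrying")
```

Call `try_acquire()` before each count request and delay the request when it returns `False`.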
What's next
- Learn how to use the Vertex AI SDK for Python to list and count tokens (Preview)
- Learn about sending chat prompts and text generation