Use the Count Tokens API

This page shows you how to get the token count for a prompt by using the countTokens API.

Supported models

The following multimodal models support getting an estimate of the prompt token count:

Gemini 2.5 Flash (Preview)
Gemini 2.5 Flash-Lite (Preview)
Gemini 2.5 Flash Image
Gemini 2.5 Flash-Lite
Gemini 2.0 Flash with image generation (Preview)
Gemini 2.5 Pro
Gemini 2.5 Flash
Gemini 2.0 Flash
Gemini 2.0 Flash-Lite

To learn more about model versions, see Gemini model versions and lifecycle.

Get the token count for a prompt

You can get the token count estimate for a prompt by using the Vertex AI API.

Console

To get the token count for a prompt by using Vertex AI Studio in the Google Cloud console, perform the following steps:

In the Vertex AI section of the Google Cloud console, go to the Vertex AI Studio page.
Click either Open Freeform or Open Chat.
The number of tokens is calculated and displayed as you type in the Prompt pane. It includes the number of tokens in any input files.
To see more details, click <count> tokens to open the Prompt tokenizer.

To view the tokens in the text prompt, highlighted in different colors that mark the boundary of each token ID, click Token ID to text. Media tokens aren't supported.
To view the token IDs, click Token ID.
To close the tokenizer tool pane, click X, or click outside of the pane.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))
response = client.models.count_tokens(
    model="gemini-2.5-flash",
    contents="What's the highest mountain in Africa?",
)
print(response)
# Example output:
# total_tokens=9
# cached_content_token_count=None

Go

Learn how to install or update the Gen AI SDK for Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"
	genai "google.golang.org/genai"
)
// countWithTxt shows how to count tokens with text input.
func countWithTxt(w io.Writer) error {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}
	modelName := "gemini-2.5-flash"
	contents := []*genai.Content{
		{Parts: []*genai.Part{
			{Text: "What's the highest mountain in Africa?"},
		}},
	}
	resp, err := client.Models.CountTokens(ctx, modelName, contents, nil)
	if err != nil {
		return fmt.Errorf("failed to count tokens: %w", err)
	}
	fmt.Fprintf(w, "Total: %d\nCached: %d\n", resp.TotalTokens, resp.CachedContentTokenCount)
	// Example response:
	// Total: 9
	// Cached: 0
	return nil
}

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function countTokens(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await client.models.countTokens({
    model: 'gemini-2.5-flash',
    contents: 'What is the highest mountain in Africa?',
  });

  console.log(response);

  return response.totalTokens;
}

Java

Learn how to install or update the Gen AI SDK for Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import com.google.genai.Client;
import com.google.genai.types.CountTokensResponse;
import com.google.genai.types.HttpOptions;
import java.util.Optional;

public class CountTokensWithText {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    countTokens(modelId);
  }

  // Counts tokens with text input
  public static Optional<Integer> countTokens(String modelId) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      CountTokensResponse response =
          client.models.countTokens(modelId, "What's the highest mountain in Africa?", null);

      System.out.print(response);
      // Example response:
      // CountTokensResponse{totalTokens=Optional[9], cachedContentTokenCount=Optional.empty}
      return response.totalTokens();
    }
  }
}

REST

To get the token count for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

LOCATION: The region to process the request. Available options include the following:
- us-central1
- us-west4
- northamerica-northeast1
- us-east4
- us-west1
- asia-northeast3
- asia-southeast1
- asia-northeast1
PROJECT_ID: Your project ID.
MODEL_ID: The model ID of the multimodal model that you want to use.
ROLE: The role in a conversation associated with the content. Specifying a role is required even in single-turn use cases. Acceptable values include the following:
- USER: Specifies content that's sent by you.
TEXT: The text instructions to include in the prompt.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens

Request JSON body:

{
  "contents": [{
    "role": "ROLE",
    "parts": [{
      "text": "TEXT"
    }]
  }]
}
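
The placeholders can also be filled in programmatically. The following is a minimal Python sketch, not one of the official samples: the helper name `build_count_tokens_request` is hypothetical, the values passed to it are illustrative, and the code only assembles the URL and JSON body described above without sending a request.

```python
import json


def build_count_tokens_request(location, project_id, model_id, role, text):
    """Return (url, body) for a countTokens call; nothing is sent."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}:countTokens"
    )
    # Same shape as the request JSON body shown above.
    body = {"contents": [{"role": role, "parts": [{"text": text}]}]}
    return url, body


# Illustrative values only; replace them with your own.
url, body = build_count_tokens_request(
    "us-central1",
    "my-project",
    "gemini-2.5-flash",
    "USER",
    "What's the highest mountain in Africa?",
)
print(url)
print(json.dumps(body, indent=2))
```

Writing the printed body to request.json gives a file in the shape that the curl and PowerShell commands expect.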

To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
 -H "Authorization: Bearer $(gcloud auth print-access-token)" \
 -H "Content-Type: application/json; charset=utf-8" \
 -d @request.json \
 "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
 -Method POST `
 -Headers $headers `
 -ContentType: "application/json; charset=utf-8" `
 -InFile request.json `
 -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

Response

{
  "totalTokens": 31,
  "totalBillableCharacters": 96,
  "promptTokensDetails": [
    {
      "modality": "TEXT",
      "tokenCount": 31
    }
  ]
}
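
To work with a response like this in code, you can parse it and total the counts per modality. The following Python sketch is an illustration only: the helper name `tokens_by_modality` is hypothetical, and the field names are taken from the sample response above rather than from a full schema.

```python
import json

# Sample response copied from the example above.
raw = """
{
  "totalTokens": 31,
  "totalBillableCharacters": 96,
  "promptTokensDetails": [
    {"modality": "TEXT", "tokenCount": 31}
  ]
}
"""


def tokens_by_modality(response):
    """Sum tokenCount per modality from a countTokens response dict."""
    totals = {}
    for detail in response.get("promptTokensDetails", []):
        modality = detail["modality"]
        totals[modality] = totals.get(modality, 0) + detail["tokenCount"]
    return totals


response = json.loads(raw)
per_modality = tokens_by_modality(response)
print(response["totalTokens"], per_modality)
```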

Example for text with image or video:

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import HttpOptions, Part

client = genai.Client(http_options=HttpOptions(api_version="v1"))
contents = [
    Part.from_uri(
        file_uri="gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
        mime_type="video/mp4",
    ),
    "Provide a description of the video.",
]
response = client.models.count_tokens(
    model="gemini-2.5-flash",
    contents=contents,
)
print(response)
# Example output:
# total_tokens=16252 cached_content_token_count=None

Go

Learn how to install or update the Gen AI SDK for Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"
	genai "google.golang.org/genai"
)
// countWithTxtAndVid shows how to count tokens with text and video inputs.
func countWithTxtAndVid(w io.Writer) error {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}
	modelName := "gemini-2.5-flash"
	contents := []*genai.Content{
		{
			Parts: []*genai.Part{
				{Text: "Provide a description of the video."},
				{FileData: &genai.FileData{
					FileURI:  "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
					MIMEType: "video/mp4",
				}},
			},
			Role: "user",
		},
	}
	resp, err := client.Models.CountTokens(ctx, modelName, contents, nil)
	if err != nil {
		return fmt.Errorf("failed to count tokens: %w", err)
	}
	fmt.Fprintf(w, "Total: %d\nCached: %d\n", resp.TotalTokens, resp.CachedContentTokenCount)
	// Example response:
	// Total: 16252
	// Cached: 0
	return nil
}

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function countTokens(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const video = {
    fileData: {
      fileUri: 'gs://cloud-samples-data/generative-ai/video/pixel8.mp4',
      mimeType: 'video/mp4',
    },
  };

  const response = await client.models.countTokens({
    model: 'gemini-2.5-flash',
    contents: [video, 'Provide a description of the video.'],
  });

  console.log(response);

  return response.totalTokens;
}

Java

Learn how to install or update the Gen AI SDK for Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import com.google.genai.Client;
import com.google.genai.types.Content;
import com.google.genai.types.CountTokensResponse;
import com.google.genai.types.HttpOptions;
import com.google.genai.types.Part;
import java.util.List;
import java.util.Optional;

public class CountTokensWithTextAndVideo {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    countTokens(modelId);
  }

  // Counts tokens with text and video inputs
  public static Optional<Integer> countTokens(String modelId) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      Content content =
          Content.fromParts(
              Part.fromText("Provide a description of this video"),
              Part.fromUri("gs://cloud-samples-data/generative-ai/video/pixel8.mp4", "video/mp4"));

      CountTokensResponse response = client.models.countTokens(modelId, List.of(content), null);

      System.out.print(response);
      // Example response:
      // CountTokensResponse{totalTokens=Optional[16707], cachedContentTokenCount=Optional.empty}
      return response.totalTokens();
    }
  }
}

REST

To get the token count for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

MODEL_ID="gemini-2.5-flash"
PROJECT_ID="my-project"
TEXT="Provide a summary with about two sentences for the following article."
REGION="us-central1"

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:countTokens" \
  -d $'{
    "contents": [{
      "role": "user",
      "parts": [
        {
          "file_data": {
            "file_uri": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
            "mime_type": "video/mp4"
          }
        },
        {
          "text": "'"$TEXT"'"
        }]
    }]
  }'
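
Splicing a shell variable into inline JSON, as the command above does with TEXT, breaks if the text contains double quotes. As a hedged alternative, the following Python sketch builds the same request body and lets `json.dumps` handle the escaping. The helper name `build_video_count_body` and the prompt text are hypothetical; the field names mirror the request body above. The sketch does not send the request.

```python
import json


def build_video_count_body(file_uri, mime_type, text):
    """Build the countTokens JSON body for one media file plus a text part."""
    return {
        "contents": [{
            "role": "user",
            "parts": [
                {"file_data": {"file_uri": file_uri, "mime_type": mime_type}},
                {"text": text},
            ],
        }]
    }


body = build_video_count_body(
    "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
    "video/mp4",
    'Summarize the "pixel8" video.',
)
# json.dumps escapes the quotes inside the text part.
payload = json.dumps(body)
print(payload)
```

You can save the payload to a file and pass it to curl with -d @FILE, as in the text-only REST example.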

Pricing and quota

There is no charge for using the CountTokens API. Requests are subject to a maximum quota of 3000 requests per minute.
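
If you count tokens for many prompts in a batch, you might want to throttle requests on the client side so that they stay under 3000 per minute. The following Python sketch shows one possible approach, a sliding-window limiter. The class name `MinuteRateLimiter` and its parameters are hypothetical and not part of any SDK, and the sketch is not thread-safe.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Sliding-window limiter: at most max_calls per window seconds."""

    def __init__(self, max_calls=3000, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._calls and now - self._calls[0] >= self.window:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.window - (now - self._calls[0]))


limiter = MinuteRateLimiter()
for prompt in ["first prompt", "second prompt"]:
    limiter.acquire()
    # Call your token-counting code for `prompt` here.
```

Call `acquire()` immediately before each count request. The clock and sleep functions are injectable so that the limiter can be tested without real waiting.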

What's next

Learn how to use the Vertex AI SDK for Python to list and count tokens (Preview)
Learn about sending chat prompts and text generation
