Class ImageTextModel (1.69.0)

ImageTextModel(model_id: str, endpoint_name: typing.Optional[str] = None)

Generates text from images.

Examples::

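from vertexai.vision_models import Image, ImageTextModel

# Load the pretrained model and a local image file.
model = ImageTextModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")

# Generate captions for the image.
captions = model.get_captions(
    image=image,
    # Optional:
    number_of_results=1,
    language="en",
)

# Ask a question about the same image.
answers = model.ask_question(
    image=image,
    question="What color is the car in this image?",
    # Optional:
    number_of_results=1,
)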

Methods

ImageTextModel

ImageTextModel(model_id: str, endpoint_name: typing.Optional[str] = None)

Creates a _ModelGardenModel.

This constructor should not be called directly. Use ImageTextModel.from_pretrained(model_name=...) instead.

ask_question

ask_question(
 image: vertexai.vision_models.Image, question: str, *, number_of_results: int = 1
) -> typing.List[str]

Answers questions about an image.
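As a minimal sketch of how ask_question might be used for more than one question, the helper below loops over a list of questions and collects the returned answer lists. The name ask_many is hypothetical and not part of the SDK; it assumes only that the model object exposes ask_question with the signature documented above.

```python
from typing import Any, Dict, Iterable, List


def ask_many(
    model: Any,
    image: Any,
    questions: Iterable[str],
    number_of_results: int = 1,
) -> Dict[str, List[str]]:
    """Hypothetical helper: map each question to the answers the model returns."""
    answers: Dict[str, List[str]] = {}
    for question in questions:
        # number_of_results is keyword-only in the documented signature.
        answers[question] = model.ask_question(
            image=image,
            question=question,
            number_of_results=number_of_results,
        )
    return answers
```

With the SDK installed and a project initialized, model would be the object returned by ImageTextModel.from_pretrained and image the result of Image.load_from_file.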

from_pretrained

from_pretrained(model_name: str) -> vertexai._model_garden._model_garden_models.T

Loads a _ModelGardenModel.

Exceptions

Type        Description
ValueError  If model_name is unknown.
ValueError  If model does not support this class.
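Because from_pretrained raises ValueError for both failure cases listed above, a caller may want to catch it and report the message. The sketch below shows one way to do that; try_from_pretrained is a hypothetical helper, not an SDK function, and it assumes only that the class passed in has a from_pretrained(model_name) class method.

```python
from typing import Any, Optional, Tuple


def try_from_pretrained(
    model_cls: Any, model_name: str
) -> Tuple[Optional[Any], Optional[str]]:
    """Hypothetical helper: return (model, None) on success, (None, message) on ValueError."""
    try:
        return model_cls.from_pretrained(model_name), None
    except ValueError as exc:
        # Raised when model_name is unknown or the model does not support this class.
        return None, str(exc)
```

For example, try_from_pretrained(ImageTextModel, "imagetext@001") would return the loaded model and None when the name is valid for this class.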

get_captions

get_captions(
 image: vertexai.vision_models.Image,
 *,
 number_of_results: int = 1,
 language: str = "en",
 output_gcs_uri: typing.Optional[str] = None
) -> typing.List[str]

Generates captions for a given image.
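Since get_captions returns a list of strings, requesting several results may yield near-identical captions. The following sketch requests multiple captions and drops blanks and case-insensitive duplicates while keeping the original order. The helper name unique_captions and the default of three results are assumptions for illustration; the source does not state which values of number_of_results the model accepts.

```python
from typing import Any, List


def unique_captions(
    model: Any,
    image: Any,
    number_of_results: int = 3,  # assumed value; accepted range is not documented here
    language: str = "en",
) -> List[str]:
    """Hypothetical helper: return captions with blanks and duplicates removed."""
    captions = model.get_captions(
        image=image,
        number_of_results=number_of_results,
        language=language,
    )
    seen = set()
    unique: List[str] = []
    for caption in captions:
        normalized = caption.strip()
        key = normalized.lower()
        # Skip empty strings and captions already seen (ignoring case).
        if normalized and key not in seen:
            seen.add(key)
            unique.append(normalized)
    return unique
```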


Last updated 2025-10-30 UTC.