Structured output for open models

Structured outputs enable a model to generate output that always adheres to a specific schema. For example, you can provide a model with a response schema to ensure that its responses are valid JSON conforming to that schema. All open models available on Vertex AI Model as a Service (MaaS) support structured outputs.

For more conceptual information about the structured output capability, see Introduction to structured output.

Use structured outputs

The following example sets a response schema that ensures the model output is a JSON object with three properties: name, date, and participants. The Python code uses the OpenAI SDK and Pydantic models to generate the JSON schema.

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="MODEL_NAME",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

print(completion.choices[0].message.parsed)

The model output will adhere to the following JSON schema:

{"name":STRING,"date":STRING,"participants":[STRING]}

Given the prompt "Alice and Bob are going to a science fair on Friday," the model could produce the following response:

{
  "name": "science fair",
  "date": "Friday",
  "participants": [
    "Alice",
    "Bob"
  ]
}
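If you store or transmit the raw JSON text rather than the parsed object, the same Pydantic model can re-validate it later. The following is a minimal sketch that reuses the CalendarEvent model from the snippet above; model_validate_json parses and validates in one step:

```python
from pydantic import BaseModel

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

# Raw JSON text as it might arrive from the model.
raw = '{"name": "science fair", "date": "Friday", "participants": ["Alice", "Bob"]}'

# Parses the text and validates it against the schema; raises
# pydantic.ValidationError if the JSON does not match the model.
event = CalendarEvent.model_validate_json(raw)
print(event.participants)  # ['Alice', 'Bob']
```

Because validation raises on mismatched data, this is a convenient guard before passing model output to downstream code.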

Detailed example

The following code is an example of a recursive schema. The UI class contains a list of children, which can also be of the UI class.

from enum import Enum
from typing import List

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class UIType(str, Enum):
    div = "div"
    button = "button"
    header = "header"
    section = "section"
    field = "field"
    form = "form"

class Attribute(BaseModel):
    name: str
    value: str

class UI(BaseModel):
    type: UIType
    label: str
    children: List["UI"]
    attributes: List[Attribute]

UI.model_rebuild()  # This is required to enable recursive types

class Response(BaseModel):
    ui: UI

completion = client.beta.chat.completions.parse(
    model="MODEL_NAME",
    messages=[
        {"role": "system", "content": "You are a UI generator AI. Convert the user input into a UI."},
        {"role": "user", "content": "Make a User Profile Form"},
    ],
    response_format=Response,
)

print(completion.choices[0].message.parsed)

The model output will adhere to the schema of the Pydantic object specified in the previous snippet. In this example, the model could generate the following UI form:

Form
 Input
 Name
 Email
 Age

A response could look like the following:

ui = UI(
    type=UIType.div,
    label='Form',
    children=[
        UI(
            type=UIType.div,
            label='Input',
            children=[],
            attributes=[
                Attribute(name='label', value='Name')
            ]
        ),
        UI(
            type=UIType.div,
            label='Input',
            children=[],
            attributes=[
                Attribute(name='label', value='Email')
            ]
        ),
        UI(
            type=UIType.div,
            label='Input',
            children=[],
            attributes=[
                Attribute(name='label', value='Age')
            ]
        )
    ],
    attributes=[
        Attribute(name='name', value='John Doe'),
        Attribute(name='email', value='john.doe@example.com'),
        Attribute(name='age', value='30')
    ]
)
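Because the schema is recursive, a short recursive walk over the parsed object is enough to turn it back into an outline like the one shown earlier. The following is a minimal sketch; the render helper is hypothetical (not part of either SDK), and type is simplified to a plain string here for brevity:

```python
from typing import List

from pydantic import BaseModel

class Attribute(BaseModel):
    name: str
    value: str

class UI(BaseModel):
    type: str  # simplified; the full example uses the UIType enum
    label: str
    children: List["UI"]
    attributes: List[Attribute]

UI.model_rebuild()  # resolve the forward reference to "UI"

def render(node: UI, depth: int = 0) -> List[str]:
    """Flatten the recursive tree into indented lines, one per node."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines

form = UI(
    type="form",
    label="Form",
    children=[
        UI(type="field", label="Name", children=[], attributes=[]),
        UI(type="field", label="Email", children=[], attributes=[]),
    ],
    attributes=[],
)

print("\n".join(render(form)))
```

The same traversal pattern works for any depth of nesting, since every child is itself a UI instance.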

Get JSON object responses

You can constrain the model to output only syntactically valid JSON objects by setting the response_format field to { "type": "json_object" }. This is often called JSON mode. JSON mode is useful when generating JSON for use with function calling or other downstream tasks that require JSON input.

When JSON mode is enabled, the model is constrained to only generate strings that parse into valid JSON objects. While this mode ensures the output is syntactically correct JSON, it does not enforce any specific schema. To ensure the model outputs JSON that follows a specific schema, you must include instructions in the prompt, as shown in the following example.

The following samples show how to enable JSON mode and instruct the model to return a JSON object with a specific structure:

Python

Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

Before running this sample, make sure to set the OPENAI_BASE_URL environment variable. For more information, see Authentication and credentials.

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="MODEL",
    response_format={"type": "json_object"},
    messages=[
        {"role": "user", "content": "List 5 rivers in South America. Your response must be a JSON object with a single key \"rivers\", which has a list of strings as its value."},
    ],
)

print(response.choices[0].message.content)

Replace MODEL with the model name you want to use, for example meta/llama3-405b-instruct-maas.

Example output

{
  "rivers": [
    "Amazon River",
    "Paraná River",
    "Orinoco River",
    "São Francisco River",
    "Magdalena River"
  ]
}
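Because JSON mode guarantees well-formed JSON but not any particular shape, it is worth checking the structure yourself after parsing. The following is a minimal sketch, where the content string stands in for response.choices[0].message.content:

```python
import json

# Stand-in for response.choices[0].message.content under JSON mode.
content = '{"rivers": ["Amazon River", "Paraná River", "Orinoco River"]}'

# JSON mode guarantees this parse succeeds, so no try/except is needed
# for syntax errors -- only for the shape of the data.
data = json.loads(content)

# The prompt asked for a "rivers" key holding a list, but JSON mode does
# not enforce that, so verify it before using the value.
rivers = data.get("rivers")
if not isinstance(rivers, list):
    raise ValueError('expected a "rivers" key holding a list')

print(len(rivers))  # 3
```

For stricter guarantees, the response schema approach shown earlier is preferable, since it enforces the structure at generation time rather than checking it afterward.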

REST

After you set up your environment, you can use REST to test a text prompt. The following sample sends a request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION: A region that supports open models.
  • MODEL: The model name you want to use, for example meta/llama3-405b-instruct-maas.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions

Request JSON body:

{
  "model": "MODEL",
  "response_format": {
    "type": "json_object"
  },
  "messages": [
    {
      "role": "user",
      "content": "List 5 rivers in South America. Your response must be a JSON object with a single key \"rivers\", which has a list of strings as its value."
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

Response

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "{\n \"rivers\": [\n \"Amazon River\",\n \"Paraná River\",\n \"Orinoco River\",\n \"São Francisco River\",\n \"Magdalena River\"\n ]\n}",
        "role": "assistant"
      }
    }
  ],
  "created": 1721860000,
  "id": "123456789",
  "model": "meta/llama3-405b-instruct-maas",
  "object": "chat.completion",
  "system_fingerprint": "12345",
  "usage": {
    "completion_tokens": 54,
    "prompt_tokens": 46,
    "total_tokens": 100
  }
}


Last updated November 24, 2025 (UTC).