Class PredictionServiceClient (1.23.0)

public class PredictionServiceClient implements BackgroundResource

Service Description: A service for online predictions and explanations.

This class provides the ability to make remote calls to the backing service through method calls that map to API methods. Sample code to get started:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  EndpointName endpoint =
      EndpointName.ofProjectLocationEndpointName("[PROJECT]", "[LOCATION]", "[ENDPOINT]");
  List<Value> instances = new ArrayList<>();
  Value parameters = Value.newBuilder().setBoolValue(true).build();
  PredictResponse response = predictionServiceClient.predict(endpoint, instances, parameters);
}

Note: close() needs to be called on the PredictionServiceClient object to clean up resources such as threads. In the example above, try-with-resources is used, which automatically calls close().
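
The cleanup behavior described in the note can be illustrated without the client library. The following is a minimal sketch, assuming a hypothetical stand-in class DemoClient that implements AutoCloseable; it is not part of this API and only demonstrates that try-with-resources invokes close() when the block exits.

```java
// Sketch only: DemoClient is a hypothetical stand-in, not a class from this library.
class TryWithResourcesSketch {
    public static final class DemoClient implements AutoCloseable {
        private boolean closed = false;

        public String call(String request) {
            if (closed) {
                throw new IllegalStateException("client already closed");
            }
            return "response to " + request;
        }

        @Override
        public void close() {
            // A real client would release threads and channels here.
            closed = true;
        }

        public boolean isClosed() {
            return closed;
        }
    }

    // Makes one call inside try-with-resources and returns the client so the
    // caller can observe that close() ran when the block exited.
    static DemoClient runOnce() {
        DemoClient client = new DemoClient();
        try (DemoClient c = client) {
            c.call("ping");
        }
        return client;
    }

    public static void main(String[] args) {
        System.out.println(runOnce().isClosed()); // prints "true"
    }
}
```

If the client is created outside a try-with-resources block, close() has to be called explicitly, typically in a finally block.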

Methods
Method Description Method Variants

Predict

Perform an online prediction.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • predict(PredictRequest request)

"Flattened" method variants have converted the fields of the request object into function parameters to enable multiple ways to call the same method.

  • predict(EndpointName endpoint, List<Value> instances, Value parameters)

  • predict(String endpoint, List<Value> instances, Value parameters)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • predictCallable()

RawPredict

Perform an online prediction with an arbitrary HTTP payload.

The response includes the following HTTP headers:

  • X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this prediction.
  • X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this prediction.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • rawPredict(RawPredictRequest request)

"Flattened" method variants have converted the fields of the request object into function parameters to enable multiple ways to call the same method.

  • rawPredict(EndpointName endpoint, HttpBody httpBody)

  • rawPredict(String endpoint, HttpBody httpBody)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • rawPredictCallable()

StreamRawPredict

Perform a streaming online prediction with an arbitrary HTTP payload.

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • streamRawPredictCallable()

DirectPredict

Perform a unary online prediction request to a gRPC model server for Vertex first-party products and frameworks.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • directPredict(DirectPredictRequest request)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • directPredictCallable()

DirectRawPredict

Perform a unary online prediction request to a gRPC model server for custom containers.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • directRawPredict(DirectRawPredictRequest request)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • directRawPredictCallable()

StreamDirectPredict

Perform a streaming online prediction request to a gRPC model server for Vertex first-party products and frameworks.

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • streamDirectPredictCallable()

StreamDirectRawPredict

Perform a streaming online prediction request to a gRPC model server for custom containers.

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • streamDirectRawPredictCallable()

StreamingPredict

Perform a streaming online prediction request for Vertex first-party products and frameworks.

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • streamingPredictCallable()

ServerStreamingPredict

Perform a server-side streaming online prediction request for Vertex LLM streaming.

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • serverStreamingPredictCallable()

StreamingRawPredict

Perform a streaming online prediction request through gRPC.

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • streamingRawPredictCallable()

Explain

Perform an online explanation.

If deployed_model_id is specified, the corresponding DeployedModel must have explanation_spec populated. If deployed_model_id is not specified, all DeployedModels must have explanation_spec populated.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • explain(ExplainRequest request)

"Flattened" method variants have converted the fields of the request object into function parameters to enable multiple ways to call the same method.

  • explain(EndpointName endpoint, List<Value> instances, Value parameters, String deployedModelId)

  • explain(String endpoint, List<Value> instances, Value parameters, String deployedModelId)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • explainCallable()

GenerateContent

Generate content with multimodal inputs.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • generateContent(GenerateContentRequest request)

"Flattened" method variants have converted the fields of the request object into function parameters to enable multiple ways to call the same method.

  • generateContent(String model, List<Content> contents)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • generateContentCallable()

StreamGenerateContent

Generate content with multimodal inputs with streaming support.

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • streamGenerateContentCallable()

ListLocations

Lists information about the supported locations for this service.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • listLocations(ListLocationsRequest request)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • listLocationsPagedCallable()

  • listLocationsCallable()

GetLocation

Gets information about a location.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • getLocation(GetLocationRequest request)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • getLocationCallable()

SetIamPolicy

Sets the access control policy on the specified resource. Replaces any existing policy.

Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • setIamPolicy(SetIamPolicyRequest request)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • setIamPolicyCallable()

GetIamPolicy

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • getIamPolicy(GetIamPolicyRequest request)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • getIamPolicyCallable()

TestIamPermissions

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error.

Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Request object method variants only take one parameter, a request object, which must be constructed before the call.

  • testIamPermissions(TestIamPermissionsRequest request)

Callable method variants take no parameters and return an immutable API callable object, which can be used to initiate calls to the service.

  • testIamPermissionsCallable()

See the individual methods for example code.

Many parameters require resource names to be formatted in a particular way. To assist with these names, this class includes a format method for each type of name, and additionally a parse method to extract the individual identifiers contained within names that are returned.
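
The format and parse idea can be sketched with the standard library alone. The example below is a minimal illustration, assuming the endpoint name pattern projects/{project}/locations/{location}/endpoints/{endpoint} used elsewhere on this page; EndpointNameSketch is a hypothetical helper and not the library's own name class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: a hypothetical helper, not the library's EndpointName class.
class EndpointNameSketch {
    private static final Pattern ENDPOINT =
        Pattern.compile("projects/([^/]+)/locations/([^/]+)/endpoints/([^/]+)");

    // Builds a resource name from its individual identifiers.
    static String format(String project, String location, String endpoint) {
        return "projects/" + project + "/locations/" + location + "/endpoints/" + endpoint;
    }

    // Returns {project, location, endpoint}, or throws if the name does not match.
    static String[] parse(String name) {
        Matcher m = ENDPOINT.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("not an endpoint resource name: " + name);
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    public static void main(String[] args) {
        String name = format("my-project", "us-central1", "123");
        System.out.println(name);           // projects/my-project/locations/us-central1/endpoints/123
        System.out.println(parse(name)[1]); // us-central1
    }
}
```

Parsing is mainly useful for names returned by the service, where the caller needs one identifier such as the location.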

This class can be customized by passing in a custom instance of PredictionServiceSettings to create(). For example:

To customize credentials:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
PredictionServiceSettings predictionServiceSettings =
    PredictionServiceSettings.newBuilder()
        .setCredentialsProvider(FixedCredentialsProvider.create(myCredentials))
        .build();
PredictionServiceClient predictionServiceClient =
    PredictionServiceClient.create(predictionServiceSettings);

To customize the endpoint:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
PredictionServiceSettings predictionServiceSettings =
    PredictionServiceSettings.newBuilder().setEndpoint(myEndpoint).build();
PredictionServiceClient predictionServiceClient =
    PredictionServiceClient.create(predictionServiceSettings);

To use REST (HTTP1.1/JSON) transport (instead of gRPC) for sending and receiving requests over the wire:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
PredictionServiceSettings predictionServiceSettings =
    PredictionServiceSettings.newHttpJsonBuilder().build();
PredictionServiceClient predictionServiceClient =
    PredictionServiceClient.create(predictionServiceSettings);

Please refer to the GitHub repository's samples for more quickstart code snippets.

Inheritance

java.lang.Object > PredictionServiceClient

Implements

BackgroundResource

Static Methods

create()

public static final PredictionServiceClient create()

Constructs an instance of PredictionServiceClient with default settings.

Returns
Type Description
PredictionServiceClient
Exceptions
Type Description
IOException

create(PredictionServiceSettings settings)

public static final PredictionServiceClient create(PredictionServiceSettings settings)

Constructs an instance of PredictionServiceClient, using the given settings. The channels are created based on the settings passed in, or defaults for any settings that are not set.

Parameter
Name Description
settings PredictionServiceSettings
Returns
Type Description
PredictionServiceClient
Exceptions
Type Description
IOException

create(PredictionServiceStub stub)

public static final PredictionServiceClient create(PredictionServiceStub stub)

Constructs an instance of PredictionServiceClient, using the given stub for making calls. This is for advanced usage - prefer using create(PredictionServiceSettings).

Parameter
Name Description
stub PredictionServiceStub
Returns
Type Description
PredictionServiceClient

Constructors

PredictionServiceClient(PredictionServiceSettings settings)

protected PredictionServiceClient(PredictionServiceSettings settings)

Constructs an instance of PredictionServiceClient, using the given settings. This is protected so that it is easy to make a subclass, but otherwise, the static factory methods should be preferred.

Parameter
Name Description
settings PredictionServiceSettings

PredictionServiceClient(PredictionServiceStub stub)

protected PredictionServiceClient(PredictionServiceStub stub)
Parameter
Name Description
stub PredictionServiceStub

Methods

awaitTermination(long duration, TimeUnit unit)

public boolean awaitTermination(long duration, TimeUnit unit)
Parameters
Name Description
duration long
unit TimeUnit
Returns
Type Description
boolean
Exceptions
Type Description
InterruptedException
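
This page gives no description for awaitTermination. Its signature matches java.util.concurrent.ExecutorService.awaitTermination, so the sketch below assumes the same contract: request shutdown, then block for up to the given timeout and report whether termination completed. It uses a plain ExecutorService as a stand-in; shutdownAndAwait is a hypothetical helper name.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: uses a standard ExecutorService as a stand-in for a background resource.
class AwaitTerminationSketch {
    // Requests shutdown, then waits up to the timeout for submitted work to finish.
    // Returns true if everything terminated in time, false on timeout or interrupt.
    static boolean shutdownAndAwait(ExecutorService executor, long duration, TimeUnit unit) {
        executor.shutdown(); // stop accepting new work; already-submitted tasks still run
        try {
            return executor.awaitTermination(duration, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> System.out.println("background work"));
        boolean terminated = shutdownAndAwait(executor, 5, TimeUnit.SECONDS);
        System.out.println("terminated: " + terminated);
    }
}
```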

close()

public final void close()

directPredict(DirectPredictRequest request)

public final DirectPredictResponse directPredict(DirectPredictRequest request)

Perform a unary online prediction request to a gRPC model server for Vertex first-party products and frameworks.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  DirectPredictRequest request =
      DirectPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllInputs(new ArrayList<Tensor>())
          .setParameters(Tensor.newBuilder().build())
          .build();
  DirectPredictResponse response = predictionServiceClient.directPredict(request);
}
Parameter
Name Description
request DirectPredictRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
DirectPredictResponse

directPredictCallable()

public final UnaryCallable<DirectPredictRequest, DirectPredictResponse> directPredictCallable()

Perform a unary online prediction request to a gRPC model server for Vertex first-party products and frameworks.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  DirectPredictRequest request =
      DirectPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllInputs(new ArrayList<Tensor>())
          .setParameters(Tensor.newBuilder().build())
          .build();
  ApiFuture<DirectPredictResponse> future =
      predictionServiceClient.directPredictCallable().futureCall(request);
  // Do something.
  DirectPredictResponse response = future.get();
}
Returns
Type Description
UnaryCallable<DirectPredictRequest,DirectPredictResponse>
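
The callable pattern used in the sample (obtain a callable, start the call with futureCall, then wait on the returned future) can be sketched with standard library types. The example below is only an illustration: DemoCallable is a hypothetical stand-in for a unary callable, and CompletableFuture stands in for the future type returned by the library.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Sketch only: DemoCallable is a hypothetical stand-in, not the library's UnaryCallable.
class CallableSketch {
    // Immutable object that starts one asynchronous call per futureCall invocation.
    static final class DemoCallable<ReqT, RespT> {
        private final Function<ReqT, RespT> handler;

        DemoCallable(Function<ReqT, RespT> handler) {
            this.handler = handler;
        }

        CompletableFuture<RespT> futureCall(ReqT request) {
            return CompletableFuture.supplyAsync(() -> handler.apply(request));
        }
    }

    static String callAndWait(String request) {
        DemoCallable<String, String> callable =
            new DemoCallable<>(r -> "prediction for " + r);
        CompletableFuture<String> future = callable.futureCall(request);
        // Other work can run here while the call is in flight.
        return future.join(); // blocks until the result is available
    }

    public static void main(String[] args) {
        System.out.println(callAndWait("input-1")); // prints "prediction for input-1"
    }
}
```

The sketch uses join() to avoid checked exceptions; the sample above uses future.get(), which the caller must wrap or declare accordingly.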

directRawPredict(DirectRawPredictRequest request)

public final DirectRawPredictResponse directRawPredict(DirectRawPredictRequest request)

Perform a unary online prediction request to a gRPC model server for custom containers.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  DirectRawPredictRequest request =
      DirectRawPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setMethodName("methodName-723163380")
          .setInput(ByteString.EMPTY)
          .build();
  DirectRawPredictResponse response = predictionServiceClient.directRawPredict(request);
}
Parameter
Name Description
request DirectRawPredictRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
DirectRawPredictResponse

directRawPredictCallable()

public final UnaryCallable<DirectRawPredictRequest, DirectRawPredictResponse> directRawPredictCallable()

Perform a unary online prediction request to a gRPC model server for custom containers.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  DirectRawPredictRequest request =
      DirectRawPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setMethodName("methodName-723163380")
          .setInput(ByteString.EMPTY)
          .build();
  ApiFuture<DirectRawPredictResponse> future =
      predictionServiceClient.directRawPredictCallable().futureCall(request);
  // Do something.
  DirectRawPredictResponse response = future.get();
}
Returns
Type Description
UnaryCallable<DirectRawPredictRequest,DirectRawPredictResponse>

explain(EndpointName endpoint, List<Value> instances, Value parameters, String deployedModelId)

public final ExplainResponse explain(EndpointName endpoint, List<Value> instances, Value parameters, String deployedModelId)

Perform an online explanation.

If deployed_model_id is specified, the corresponding DeployedModel must have explanation_spec populated. If deployed_model_id is not specified, all DeployedModels must have explanation_spec populated.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  EndpointName endpoint =
      EndpointName.ofProjectLocationEndpointName("[PROJECT]", "[LOCATION]", "[ENDPOINT]");
  List<Value> instances = new ArrayList<>();
  Value parameters = Value.newBuilder().setBoolValue(true).build();
  String deployedModelId = "deployedModelId-1817547906";
  ExplainResponse response =
      predictionServiceClient.explain(endpoint, instances, parameters, deployedModelId);
}
Parameters
Name Description
endpoint EndpointName

Required. The name of the Endpoint requested to serve the explanation. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

instances List<Value>

Required. The instances that are the input to the explanation call. A DeployedModel may have an upper limit on the number of instances it supports per request, and when it is exceeded the explanation call errors in case of AutoML Models, or, in case of customer created Models, the behaviour is as documented by that Model. The schema of any single instance may be specified via Endpoint's DeployedModels' Model's PredictSchemata's instance_schema_uri.

parameters Value

The parameters that govern the prediction. The schema of the parameters may be specified via Endpoint's DeployedModels' Model's PredictSchemata's parameters_schema_uri.

deployedModelId String

If specified, this ExplainRequest will be served by the chosen DeployedModel, overriding Endpoint.traffic_split.

Returns
Type Description
ExplainResponse
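
The explanation_spec precondition stated for this method can be expressed as a small check. The following is a minimal sketch of that rule only, assuming a hypothetical map from DeployedModel ID to whether its explanation_spec is populated; it does not call the service, and canExplain is not a library method.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: models the documented precondition, not the service's implementation.
class ExplainPreconditionSketch {
    // deployedModels maps a DeployedModel ID to whether its explanation_spec is populated.
    // A null or empty deployedModelId means "not specified".
    static boolean canExplain(Map<String, Boolean> deployedModels, String deployedModelId) {
        if (deployedModelId != null && !deployedModelId.isEmpty()) {
            // Only the chosen DeployedModel needs an explanation_spec.
            return Boolean.TRUE.equals(deployedModels.get(deployedModelId));
        }
        // No model chosen: every DeployedModel must have one (vacuously true for an empty map).
        return deployedModels.values().stream().allMatch(Boolean.TRUE::equals);
    }

    public static void main(String[] args) {
        Map<String, Boolean> models = new LinkedHashMap<>();
        models.put("model-a", true);
        models.put("model-b", false);
        System.out.println(canExplain(models, "model-a")); // true
        System.out.println(canExplain(models, "model-b")); // false
        System.out.println(canExplain(models, ""));        // false
    }
}
```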

explain(ExplainRequest request)

public final ExplainResponse explain(ExplainRequest request)

Perform an online explanation.

If deployed_model_id is specified, the corresponding DeployedModel must have explanation_spec populated. If deployed_model_id is not specified, all DeployedModels must have explanation_spec populated.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  ExplainRequest request =
      ExplainRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllInstances(new ArrayList<Value>())
          .setParameters(Value.newBuilder().setBoolValue(true).build())
          .setExplanationSpecOverride(ExplanationSpecOverride.newBuilder().build())
          .setDeployedModelId("deployedModelId-1817547906")
          .build();
  ExplainResponse response = predictionServiceClient.explain(request);
}
Parameter
Name Description
request ExplainRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
ExplainResponse

explain(String endpoint, List<Value> instances, Value parameters, String deployedModelId)

public final ExplainResponse explain(String endpoint, List<Value> instances, Value parameters, String deployedModelId)

Perform an online explanation.

If deployed_model_id is specified, the corresponding DeployedModel must have explanation_spec populated. If deployed_model_id is not specified, all DeployedModels must have explanation_spec populated.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  String endpoint =
      EndpointName.ofProjectLocationEndpointName("[PROJECT]", "[LOCATION]", "[ENDPOINT]")
          .toString();
  List<Value> instances = new ArrayList<>();
  Value parameters = Value.newBuilder().setBoolValue(true).build();
  String deployedModelId = "deployedModelId-1817547906";
  ExplainResponse response =
      predictionServiceClient.explain(endpoint, instances, parameters, deployedModelId);
}
Parameters
Name Description
endpoint String

Required. The name of the Endpoint requested to serve the explanation. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

instances List<Value>

Required. The instances that are the input to the explanation call. A DeployedModel may have an upper limit on the number of instances it supports per request, and when it is exceeded the explanation call errors in case of AutoML Models, or, in case of customer created Models, the behaviour is as documented by that Model. The schema of any single instance may be specified via Endpoint's DeployedModels' Model's PredictSchemata's instance_schema_uri.

parameters Value

The parameters that govern the prediction. The schema of the parameters may be specified via Endpoint's DeployedModels' Model's PredictSchemata's parameters_schema_uri.

deployedModelId String

If specified, this ExplainRequest will be served by the chosen DeployedModel, overriding Endpoint.traffic_split.

Returns
Type Description
ExplainResponse

explainCallable()

public final UnaryCallable<ExplainRequest, ExplainResponse> explainCallable()

Perform an online explanation.

If deployed_model_id is specified, the corresponding DeployedModel must have explanation_spec populated. If deployed_model_id is not specified, all DeployedModels must have explanation_spec populated.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  ExplainRequest request =
      ExplainRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllInstances(new ArrayList<Value>())
          .setParameters(Value.newBuilder().setBoolValue(true).build())
          .setExplanationSpecOverride(ExplanationSpecOverride.newBuilder().build())
          .setDeployedModelId("deployedModelId-1817547906")
          .build();
  ApiFuture<ExplainResponse> future =
      predictionServiceClient.explainCallable().futureCall(request);
  // Do something.
  ExplainResponse response = future.get();
}
Returns
Type Description
UnaryCallable<ExplainRequest,ExplainResponse>

generateContent(GenerateContentRequest request)

public final GenerateContentResponse generateContent(GenerateContentRequest request)

Generate content with multimodal inputs.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  GenerateContentRequest request =
      GenerateContentRequest.newBuilder()
          .setModel("model104069929")
          .addAllContents(new ArrayList<Content>())
          .setSystemInstruction(Content.newBuilder().build())
          .setCachedContent(
              CachedContentName.of("[PROJECT]", "[LOCATION]", "[CACHED_CONTENT]").toString())
          .addAllTools(new ArrayList<Tool>())
          .setToolConfig(ToolConfig.newBuilder().build())
          .putAllLabels(new HashMap<String, String>())
          .addAllSafetySettings(new ArrayList<SafetySetting>())
          .setGenerationConfig(GenerationConfig.newBuilder().build())
          .build();
  GenerateContentResponse response = predictionServiceClient.generateContent(request);
}
Parameter
Name Description
request GenerateContentRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
GenerateContentResponse

generateContent(String model, List<Content> contents)

public final GenerateContentResponse generateContent(String model, List<Content> contents)

Generate content with multimodal inputs.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  String model = "model104069929";
  List<Content> contents = new ArrayList<>();
  GenerateContentResponse response = predictionServiceClient.generateContent(model, contents);
}
Parameters
Name Description
model String

Required. The fully qualified name of the publisher model or tuned model endpoint to use.

Publisher model format: projects/{project}/locations/{location}/publishers/*/models/*

Tuned model endpoint format: projects/{project}/locations/{location}/endpoints/{endpoint}

contents List<Content>

Required. The content of the current conversation with the model.

For single-turn queries, this is a single instance. For multi-turn queries, this is a repeated field that contains conversation history + latest request.

Returns
Type Description
GenerateContentResponse
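
The difference between single-turn and multi-turn contents can be sketched with plain Java types. In the example below, Turn is a hypothetical stand-in for a Content message, and the role strings "user" and "model" are assumptions made for illustration; neither is taken from this page.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Turn is a hypothetical stand-in for a Content message.
class ContentsSketch {
    static final class Turn {
        final String role; // assumed role labels, e.g. "user" or "model"
        final String text;

        Turn(String role, String text) {
            this.role = role;
            this.text = text;
        }
    }

    // Single-turn query: the list holds only the latest request.
    static List<Turn> singleTurn(String prompt) {
        List<Turn> contents = new ArrayList<>();
        contents.add(new Turn("user", prompt));
        return contents;
    }

    // Multi-turn query: conversation history first, then the latest request.
    static List<Turn> multiTurn(List<Turn> history, String prompt) {
        List<Turn> contents = new ArrayList<>(history);
        contents.add(new Turn("user", prompt));
        return contents;
    }

    public static void main(String[] args) {
        List<Turn> history = new ArrayList<>();
        history.add(new Turn("user", "What is Vertex AI?"));
        history.add(new Turn("model", "A machine learning platform."));
        System.out.println(singleTurn("Hello").size());                // 1
        System.out.println(multiTurn(history, "Tell me more").size()); // 3
    }
}
```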

generateContentCallable()

public final UnaryCallable<GenerateContentRequest, GenerateContentResponse> generateContentCallable()

Generate content with multimodal inputs.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  GenerateContentRequest request =
      GenerateContentRequest.newBuilder()
          .setModel("model104069929")
          .addAllContents(new ArrayList<Content>())
          .setSystemInstruction(Content.newBuilder().build())
          .setCachedContent(
              CachedContentName.of("[PROJECT]", "[LOCATION]", "[CACHED_CONTENT]").toString())
          .addAllTools(new ArrayList<Tool>())
          .setToolConfig(ToolConfig.newBuilder().build())
          .putAllLabels(new HashMap<String, String>())
          .addAllSafetySettings(new ArrayList<SafetySetting>())
          .setGenerationConfig(GenerationConfig.newBuilder().build())
          .build();
  ApiFuture<GenerateContentResponse> future =
      predictionServiceClient.generateContentCallable().futureCall(request);
  // Do something.
  GenerateContentResponse response = future.get();
}
Returns
Type Description
UnaryCallable<GenerateContentRequest,GenerateContentResponse>

getIamPolicy(GetIamPolicyRequest request)

public final Policy getIamPolicy(GetIamPolicyRequest request)

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  GetIamPolicyRequest request =
      GetIamPolicyRequest.newBuilder()
          .setResource(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setOptions(GetPolicyOptions.newBuilder().build())
          .build();
  Policy response = predictionServiceClient.getIamPolicy(request);
}
Parameter
Name Description
request com.google.iam.v1.GetIamPolicyRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
com.google.iam.v1.Policy

getIamPolicyCallable()

public final UnaryCallable<GetIamPolicyRequest,Policy> getIamPolicyCallable()

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  GetIamPolicyRequest request =
      GetIamPolicyRequest.newBuilder()
          .setResource(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setOptions(GetPolicyOptions.newBuilder().build())
          .build();
  ApiFuture<Policy> future = predictionServiceClient.getIamPolicyCallable().futureCall(request);
  // Do something.
  Policy response = future.get();
}
Returns
Type Description
UnaryCallable<com.google.iam.v1.GetIamPolicyRequest,com.google.iam.v1.Policy>

getLocation(GetLocationRequest request)

public final Location getLocation(GetLocationRequest request)

Gets information about a location.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  GetLocationRequest request = GetLocationRequest.newBuilder().setName("name3373707").build();
  Location response = predictionServiceClient.getLocation(request);
}
Parameter
Name Description
request com.google.cloud.location.GetLocationRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
com.google.cloud.location.Location

getLocationCallable()

public final UnaryCallable<GetLocationRequest,Location> getLocationCallable()

Gets information about a location.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  GetLocationRequest request = GetLocationRequest.newBuilder().setName("name3373707").build();
  ApiFuture<Location> future =
      predictionServiceClient.getLocationCallable().futureCall(request);
  // Do something.
  Location response = future.get();
}
Returns
Type Description
UnaryCallable<com.google.cloud.location.GetLocationRequest,com.google.cloud.location.Location>

getSettings()

public final PredictionServiceSettings getSettings()
Returns
Type Description
PredictionServiceSettings

getStub()

public PredictionServiceStub getStub()
Returns
Type Description
PredictionServiceStub

isShutdown()

public boolean isShutdown()
Returns
Type Description
boolean

isTerminated()

public boolean isTerminated()
Returns
Type Description
boolean

listLocations(ListLocationsRequest request)

public final PredictionServiceClient.ListLocationsPagedResponse listLocations(ListLocationsRequest request)

Lists information about the supported locations for this service.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  ListLocationsRequest request =
      ListLocationsRequest.newBuilder()
          .setName("name3373707")
          .setFilter("filter-1274492040")
          .setPageSize(883849137)
          .setPageToken("pageToken873572522")
          .build();
  for (Location element : predictionServiceClient.listLocations(request).iterateAll()) {
    // doThingsWith(element);
  }
}
Parameter
Name Description
request com.google.cloud.location.ListLocationsRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
PredictionServiceClient.ListLocationsPagedResponse

listLocationsCallable()

public final UnaryCallable<ListLocationsRequest,ListLocationsResponse> listLocationsCallable()

Lists information about the supported locations for this service.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  ListLocationsRequest request =
      ListLocationsRequest.newBuilder()
          .setName("name3373707")
          .setFilter("filter-1274492040")
          .setPageSize(883849137)
          .setPageToken("pageToken873572522")
          .build();
  while (true) {
    ListLocationsResponse response =
        predictionServiceClient.listLocationsCallable().call(request);
    for (Location element : response.getLocationsList()) {
      // doThingsWith(element);
    }
    String nextPageToken = response.getNextPageToken();
    if (!Strings.isNullOrEmpty(nextPageToken)) {
      request = request.toBuilder().setPageToken(nextPageToken).build();
    } else {
      break;
    }
  }
}
Returns
Type Description
UnaryCallable<com.google.cloud.location.ListLocationsRequest,com.google.cloud.location.ListLocationsResponse>
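
The page-token loop used with listLocationsCallable() can be exercised without a backing service. The following is a minimal sketch under stated assumptions: the class PageTokenLoopSketch, its Page type, and the listPage method are hypothetical stand-ins for the RPC and its response, and the token is simply an offset encoded as a string. Real page tokens are opaque and should be passed back unchanged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a paged list RPC; not part of the client library.
final class PageTokenLoopSketch {

  // One page of results plus the token for the next page ("" when no pages remain).
  static final class Page {
    final List<String> items;
    final String nextPageToken;

    Page(List<String> items, String nextPageToken) {
      this.items = items;
      this.nextPageToken = nextPageToken;
    }
  }

  // Returns up to pageSize items starting at the offset encoded in pageToken.
  static Page listPage(List<String> all, int pageSize, String pageToken) {
    int start = pageToken.isEmpty() ? 0 : Integer.parseInt(pageToken);
    int end = Math.min(start + pageSize, all.size());
    String next = end < all.size() ? Integer.toString(end) : "";
    return new Page(new ArrayList<>(all.subList(start, end)), next);
  }

  // Same control flow as the sample: call, consume the page, follow the token until it is empty.
  static List<String> collectAll(List<String> all, int pageSize) {
    List<String> seen = new ArrayList<>();
    String pageToken = "";
    while (true) {
      Page page = listPage(all, pageSize, pageToken);
      seen.addAll(page.items);
      if (page.nextPageToken.isEmpty()) {
        break;
      }
      pageToken = page.nextPageToken;
    }
    return seen;
  }

  public static void main(String[] args) {
    List<String> locations = List.of("us-central1", "europe-west4", "asia-east1");
    System.out.println(collectAll(locations, 2));
  }
}
```

When manual control over page boundaries is not needed, the paged variants (listLocations and listLocationsPagedCallable) hide this loop behind iterateAll().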

listLocationsPagedCallable()

public final UnaryCallable<ListLocationsRequest,PredictionServiceClient.ListLocationsPagedResponse> listLocationsPagedCallable()

Lists information about the supported locations for this service.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  ListLocationsRequest request =
      ListLocationsRequest.newBuilder()
          .setName("name3373707")
          .setFilter("filter-1274492040")
          .setPageSize(883849137)
          .setPageToken("pageToken873572522")
          .build();
  // The future resolves to the paged response, not to a single Location.
  ApiFuture<PredictionServiceClient.ListLocationsPagedResponse> future =
      predictionServiceClient.listLocationsPagedCallable().futureCall(request);
  // Do something.
  for (Location element : future.get().iterateAll()) {
    // doThingsWith(element);
  }
}
Returns
Type Description
UnaryCallable<com.google.cloud.location.ListLocationsRequest,ListLocationsPagedResponse>

predict(EndpointName endpoint, List<Value> instances, Value parameters)

public final PredictResponse predict(EndpointName endpoint, List<Value> instances, Value parameters)

Perform an online prediction.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  EndpointName endpoint =
      EndpointName.ofProjectLocationEndpointName("[PROJECT]", "[LOCATION]", "[ENDPOINT]");
  List<Value> instances = new ArrayList<>();
  Value parameters = Value.newBuilder().setBoolValue(true).build();
  PredictResponse response = predictionServiceClient.predict(endpoint, instances, parameters);
}
Parameters
Name Description
endpoint EndpointName

Required. The name of the Endpoint requested to serve the prediction. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

instances List<Value>

Required. The instances that are the input to the prediction call. A DeployedModel may have an upper limit on the number of instances it supports per request, and when it is exceeded the prediction call errors in case of AutoML Models, or, in case of customer created Models, the behaviour is as documented by that Model. The schema of any single instance may be specified via Endpoint's DeployedModels' Model's PredictSchemata's instance_schema_uri.

parameters Value

The parameters that govern the prediction. The schema of the parameters may be specified via Endpoint's DeployedModels' Model's PredictSchemata's parameters_schema_uri.

Returns
Type Description
PredictResponse
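
Because a DeployedModel may cap the number of instances per request, a caller may want to split a large input list into several predict calls. The sketch below shows one way to do that partitioning with plain Java collections. The class name InstanceBatchingSketch and the maxPerRequest parameter are hypothetical; the actual limit depends on the deployed model and is not exposed by this method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the client library.
final class InstanceBatchingSketch {

  // Splits instances into consecutive batches of at most maxPerRequest elements.
  static <T> List<List<T>> partition(List<T> instances, int maxPerRequest) {
    if (maxPerRequest <= 0) {
      throw new IllegalArgumentException("maxPerRequest must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < instances.size(); start += maxPerRequest) {
      int end = Math.min(start + maxPerRequest, instances.size());
      batches.add(new ArrayList<>(instances.subList(start, end)));
    }
    return batches;
  }

  public static void main(String[] args) {
    // Each inner list would be passed as the instances argument of one predict call.
    System.out.println(partition(List.of(1, 2, 3, 4, 5), 2));
  }
}
```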

predict(PredictRequest request)

public final PredictResponse predict(PredictRequest request)

Perform an online prediction.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  PredictRequest request =
      PredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllInstances(new ArrayList<Value>())
          .setParameters(Value.newBuilder().setBoolValue(true).build())
          .build();
  PredictResponse response = predictionServiceClient.predict(request);
}
Parameter
Name Description
request PredictRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
PredictResponse

predict(String endpoint, List<Value> instances, Value parameters)

public final PredictResponse predict(String endpoint, List<Value> instances, Value parameters)

Perform an online prediction.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  String endpoint =
      EndpointName.ofProjectLocationEndpointName("[PROJECT]", "[LOCATION]", "[ENDPOINT]")
          .toString();
  List<Value> instances = new ArrayList<>();
  Value parameters = Value.newBuilder().setBoolValue(true).build();
  PredictResponse response = predictionServiceClient.predict(endpoint, instances, parameters);
}
Parameters
Name Description
endpoint String

Required. The name of the Endpoint requested to serve the prediction. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

instances List<Value>

Required. The instances that are the input to the prediction call. A DeployedModel may have an upper limit on the number of instances it supports per request, and when it is exceeded the prediction call errors in case of AutoML Models, or, in case of customer created Models, the behaviour is as documented by that Model. The schema of any single instance may be specified via Endpoint's DeployedModels' Model's PredictSchemata's instance_schema_uri.

parameters Value

The parameters that govern the prediction. The schema of the parameters may be specified via Endpoint's DeployedModels' Model's PredictSchemata's parameters_schema_uri.

Returns
Type Description
PredictResponse
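
When the endpoint is passed as a String, it must follow the format quoted above. The following sketch builds and checks such a string with the standard library only. EndpointPathSketch and its methods are hypothetical names introduced for illustration, and the sketch covers only the projects/{project}/locations/{location}/endpoints/{endpoint} pattern; in real code, EndpointName is the supported way to produce this value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; EndpointName from the client library is the supported alternative.
final class EndpointPathSketch {

  private static final Pattern ENDPOINT =
      Pattern.compile("projects/([^/]+)/locations/([^/]+)/endpoints/([^/]+)");

  // Builds a resource name in the documented format.
  static String format(String project, String location, String endpoint) {
    return String.format("projects/%s/locations/%s/endpoints/%s", project, location, endpoint);
  }

  // Returns true when the whole string matches the documented format.
  static boolean isValid(String name) {
    return ENDPOINT.matcher(name).matches();
  }

  // Extracts the {location} segment, for example to pick a regional service endpoint.
  static String locationOf(String name) {
    Matcher matcher = ENDPOINT.matcher(name);
    if (!matcher.matches()) {
      throw new IllegalArgumentException("Not an endpoint resource name: " + name);
    }
    return matcher.group(2);
  }

  public static void main(String[] args) {
    String name = format("my-project", "us-central1", "1234567890");
    System.out.println(name + " valid=" + isValid(name) + " location=" + locationOf(name));
  }
}
```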

predictCallable()

public final UnaryCallable<PredictRequest,PredictResponse> predictCallable()

Perform an online prediction.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  PredictRequest request =
      PredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllInstances(new ArrayList<Value>())
          .setParameters(Value.newBuilder().setBoolValue(true).build())
          .build();
  ApiFuture<PredictResponse> future =
      predictionServiceClient.predictCallable().futureCall(request);
  // Do something.
  PredictResponse response = future.get();
}
Returns
Type Description
UnaryCallable<PredictRequest,PredictResponse>
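
The callable variants return a future, so the caller can start the request, do other work, and block only when the result is needed. The sketch below illustrates that shape using java.util.concurrent.Future, which ApiFuture extends. FutureCallSketch and fakeFutureCall are hypothetical stand-ins for predictCallable().futureCall(request), with a CompletableFuture playing the role of the in-flight RPC; the timeout value is an arbitrary choice for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a future-returning callable; not part of the client library.
final class FutureCallSketch {

  // Plays the role of predictCallable().futureCall(request).
  static Future<String> fakeFutureCall(String request) {
    return CompletableFuture.supplyAsync(() -> "prediction for " + request);
  }

  // Starts the call, then waits at most timeoutSeconds for the result.
  static String callWithTimeout(String request, long timeoutSeconds) {
    Future<String> future = fakeFutureCall(request);
    // Other work can happen here while the call is in flight.
    try {
      return future.get(timeoutSeconds, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // Preserve the interrupt for callers.
      throw new IllegalStateException("Interrupted while waiting", e);
    } catch (ExecutionException | TimeoutException e) {
      future.cancel(true); // Give up on the pending call.
      throw new IllegalStateException("Call did not complete", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(callWithTimeout("instance-1", 5));
  }
}
```

Bounding the wait with get(timeout, unit) rather than the bare get() used in the generated sample is a design choice of this sketch, not something the sample requires.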

rawPredict(EndpointName endpoint, HttpBody httpBody)

public final HttpBody rawPredict(EndpointName endpoint, HttpBody httpBody)

Perform an online prediction with an arbitrary HTTP payload.

The response includes the following HTTP headers:

  • X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this prediction.
  • X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this prediction.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  EndpointName endpoint =
      EndpointName.ofProjectLocationEndpointName("[PROJECT]", "[LOCATION]", "[ENDPOINT]");
  HttpBody httpBody = HttpBody.newBuilder().build();
  HttpBody response = predictionServiceClient.rawPredict(endpoint, httpBody);
}
Parameters
Name Description
endpoint EndpointName

Required. The name of the Endpoint requested to serve the prediction. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

httpBody com.google.api.HttpBody

The prediction input. Supports HTTP headers and arbitrary data payload.

A DeployedModel may have an upper limit on the number of instances it supports per request. When this limit is exceeded for an AutoML model, the RawPredict method returns an error. When this limit is exceeded for a custom-trained model, the behavior varies depending on the model.

You can specify the schema for each instance in the predict_schemata.instance_schema_uri field when you create a Model. This schema applies when you deploy the Model as a DeployedModel to an Endpoint and use the RawPredict method.

Returns
Type Description
com.google.api.HttpBody
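
An HttpBody carries a content type and an arbitrary byte payload. The sketch below shows how such a payload might be assembled for a JSON request. RawPayloadSketch and its RawBody type are hypothetical stand-ins for the content type and data fields of HttpBody, and the {"instances": [...]} envelope is an assumption made for illustration; the model server behind the endpoint defines the payload it actually accepts.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical stand-in for the content type and data of an HttpBody.
final class RawPayloadSketch {

  static final class RawBody {
    final String contentType;
    final byte[] data;

    RawBody(String contentType, byte[] data) {
      this.contentType = contentType;
      this.data = data;
    }
  }

  // Wraps already-serialized JSON instances in an assumed {"instances": [...]} envelope.
  static RawBody jsonBody(List<String> instanceJson) {
    String json = "{\"instances\":[" + String.join(",", instanceJson) + "]}";
    return new RawBody("application/json", json.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) {
    RawBody body = jsonBody(List.of("{\"x\":1}", "{\"x\":2}"));
    System.out.println(body.contentType + " " + new String(body.data, StandardCharsets.UTF_8));
  }
}
```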

rawPredict(RawPredictRequest request)

public final HttpBody rawPredict(RawPredictRequest request)

Perform an online prediction with an arbitrary HTTP payload.

The response includes the following HTTP headers:

  • X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this prediction.
  • X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this prediction.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  RawPredictRequest request =
      RawPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setHttpBody(HttpBody.newBuilder().build())
          .build();
  HttpBody response = predictionServiceClient.rawPredict(request);
}
Parameter
Name Description
request RawPredictRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
com.google.api.HttpBody

rawPredict(String endpoint, HttpBody httpBody)

public final HttpBody rawPredict(String endpoint, HttpBody httpBody)

Perform an online prediction with an arbitrary HTTP payload.

The response includes the following HTTP headers:

  • X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this prediction.
  • X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this prediction.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  String endpoint =
      EndpointName.ofProjectLocationEndpointName("[PROJECT]", "[LOCATION]", "[ENDPOINT]")
          .toString();
  HttpBody httpBody = HttpBody.newBuilder().build();
  HttpBody response = predictionServiceClient.rawPredict(endpoint, httpBody);
}
Parameters
Name Description
endpoint String

Required. The name of the Endpoint requested to serve the prediction. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

httpBody com.google.api.HttpBody

The prediction input. Supports HTTP headers and arbitrary data payload.

A DeployedModel may have an upper limit on the number of instances it supports per request. When this limit is exceeded for an AutoML model, the RawPredict method returns an error. When this limit is exceeded for a custom-trained model, the behavior varies depending on the model.

You can specify the schema for each instance in the predict_schemata.instance_schema_uri field when you create a Model. This schema applies when you deploy the Model as a DeployedModel to an Endpoint and use the RawPredict method.

Returns
Type Description
com.google.api.HttpBody

rawPredictCallable()

public final UnaryCallable<RawPredictRequest,HttpBody> rawPredictCallable()

Perform an online prediction with an arbitrary HTTP payload.

The response includes the following HTTP headers:

  • X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this prediction.
  • X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this prediction.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  RawPredictRequest request =
      RawPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setHttpBody(HttpBody.newBuilder().build())
          .build();
  ApiFuture<HttpBody> future = predictionServiceClient.rawPredictCallable().futureCall(request);
  // Do something.
  HttpBody response = future.get();
}
Returns
Type Description
UnaryCallable<RawPredictRequest,com.google.api.HttpBody>

serverStreamingPredictCallable()

public final ServerStreamingCallable<StreamingPredictRequest,StreamingPredictResponse> serverStreamingPredictCallable()

Perform a server-side streaming online prediction request for Vertex LLM streaming.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  StreamingPredictRequest request =
      StreamingPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllInputs(new ArrayList<Tensor>())
          .setParameters(Tensor.newBuilder().build())
          .build();
  ServerStream<StreamingPredictResponse> stream =
      predictionServiceClient.serverStreamingPredictCallable().call(request);
  for (StreamingPredictResponse response : stream) {
    // Do something when a response is received.
  }
}
Returns
Type Description
ServerStreamingCallable<StreamingPredictRequest,StreamingPredictResponse>
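
With a server-streaming call, one request produces a sequence of responses that the caller consumes as they arrive. The sketch below mimics that flow with a producer thread and a blocking queue. ServerStreamSketch, its consume method, and the END sentinel are hypothetical names chosen for the example; the real ServerStream is simply iterated in a for loop as the sample shows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for consuming a server stream; not part of the client library.
final class ServerStreamSketch {

  // Sentinel that marks the end of the stream in this sketch.
  private static final String END = "\u0000END";

  // A producer thread plays the server; the caller blocks until each chunk arrives.
  static List<String> consume(List<String> chunks) {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    Thread producer =
        new Thread(
            () -> {
              for (String chunk : chunks) {
                queue.add(chunk);
              }
              queue.add(END);
            });
    producer.start();

    List<String> received = new ArrayList<>();
    try {
      while (true) {
        String next = queue.take(); // Blocks until the next response is available.
        if (END.equals(next)) {
          break;
        }
        received.add(next); // "Do something when a response is received."
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while reading the stream", e);
    }
    return received;
  }

  public static void main(String[] args) {
    System.out.println(consume(List.of("Hello", ", ", "world")));
  }
}
```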

setIamPolicy(SetIamPolicyRequest request)

public final Policy setIamPolicy(SetIamPolicyRequest request)

Sets the access control policy on the specified resource. Replaces any existing policy.

Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  SetIamPolicyRequest request =
      SetIamPolicyRequest.newBuilder()
          .setResource(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setPolicy(Policy.newBuilder().build())
          .setUpdateMask(FieldMask.newBuilder().build())
          .build();
  Policy response = predictionServiceClient.setIamPolicy(request);
}
Parameter
Name Description
request com.google.iam.v1.SetIamPolicyRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
com.google.iam.v1.Policy
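
Because setIamPolicy replaces any existing policy, adding a single member usually means reading the current policy, modifying a copy, and writing the whole policy back. The sketch below models that read-modify-write cycle in memory. PolicyReplaceSketch is hypothetical, a policy is reduced to a map from role to members, and concurrency control is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of replace-on-set policy semantics.
final class PolicyReplaceSketch {

  // resource -> (role -> members)
  private final Map<String, Map<String, Set<String>>> policies = new HashMap<>();

  // Returns a copy of the stored policy, or an empty policy when none is set.
  Map<String, Set<String>> getPolicy(String resource) {
    return deepCopy(policies.getOrDefault(resource, new HashMap<>()));
  }

  // Replaces the stored policy wholesale; nothing from the old policy survives.
  void setPolicy(String resource, Map<String, Set<String>> policy) {
    policies.put(resource, deepCopy(policy));
  }

  // Read-modify-write: fetch, change the copy, write everything back.
  void addMember(String resource, String role, String member) {
    Map<String, Set<String>> policy = getPolicy(resource);
    policy.computeIfAbsent(role, unused -> new TreeSet<>()).add(member);
    setPolicy(resource, policy);
  }

  private static Map<String, Set<String>> deepCopy(Map<String, Set<String>> source) {
    Map<String, Set<String>> copy = new HashMap<>();
    source.forEach((role, members) -> copy.put(role, new TreeSet<>(members)));
    return copy;
  }

  public static void main(String[] args) {
    PolicyReplaceSketch store = new PolicyReplaceSketch();
    store.addMember("endpoint-1", "roles/viewer", "user:a@example.com");
    store.addMember("endpoint-1", "roles/viewer", "user:b@example.com");
    System.out.println(store.getPolicy("endpoint-1"));
  }
}
```

Calling setPolicy with only the new binding, instead of going through addMember, would drop every binding that was there before, which is the behavior the description warns about.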

setIamPolicyCallable()

public final UnaryCallable<SetIamPolicyRequest,Policy> setIamPolicyCallable()

Sets the access control policy on the specified resource. Replaces any existing policy.

Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  SetIamPolicyRequest request =
      SetIamPolicyRequest.newBuilder()
          .setResource(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setPolicy(Policy.newBuilder().build())
          .setUpdateMask(FieldMask.newBuilder().build())
          .build();
  ApiFuture<Policy> future = predictionServiceClient.setIamPolicyCallable().futureCall(request);
  // Do something.
  Policy response = future.get();
}
Returns
Type Description
UnaryCallable<com.google.iam.v1.SetIamPolicyRequest,com.google.iam.v1.Policy>

shutdown()

public void shutdown()

shutdownNow()

public void shutdownNow()

streamDirectPredictCallable()

public final BidiStreamingCallable<StreamDirectPredictRequest,StreamDirectPredictResponse> streamDirectPredictCallable()

Perform a streaming online prediction request to a gRPC model server for Vertex first-party products and frameworks.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  BidiStream<StreamDirectPredictRequest, StreamDirectPredictResponse> bidiStream =
      predictionServiceClient.streamDirectPredictCallable().call();
  StreamDirectPredictRequest request =
      StreamDirectPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllInputs(new ArrayList<Tensor>())
          .setParameters(Tensor.newBuilder().build())
          .build();
  bidiStream.send(request);
  for (StreamDirectPredictResponse response : bidiStream) {
    // Do something when a response is received.
  }
}
Returns
Type Description
BidiStreamingCallable<StreamDirectPredictRequest,StreamDirectPredictResponse>

streamDirectRawPredictCallable()

public final BidiStreamingCallable<StreamDirectRawPredictRequest,StreamDirectRawPredictResponse> streamDirectRawPredictCallable()

Perform a streaming online prediction request to a gRPC model server for custom containers.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  BidiStream<StreamDirectRawPredictRequest, StreamDirectRawPredictResponse> bidiStream =
      predictionServiceClient.streamDirectRawPredictCallable().call();
  StreamDirectRawPredictRequest request =
      StreamDirectRawPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setMethodName("methodName-723163380")
          .setInput(ByteString.EMPTY)
          .build();
  bidiStream.send(request);
  for (StreamDirectRawPredictResponse response : bidiStream) {
    // Do something when a response is received.
  }
}
Returns
Type Description
BidiStreamingCallable<StreamDirectRawPredictRequest,StreamDirectRawPredictResponse>

streamGenerateContentCallable()

public final ServerStreamingCallable<GenerateContentRequest,GenerateContentResponse> streamGenerateContentCallable()

Generate content with multimodal inputs with streaming support.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  GenerateContentRequest request =
      GenerateContentRequest.newBuilder()
          .setModel("model104069929")
          .addAllContents(new ArrayList<Content>())
          .setSystemInstruction(Content.newBuilder().build())
          .setCachedContent(
              CachedContentName.of("[PROJECT]", "[LOCATION]", "[CACHED_CONTENT]").toString())
          .addAllTools(new ArrayList<Tool>())
          .setToolConfig(ToolConfig.newBuilder().build())
          .putAllLabels(new HashMap<String, String>())
          .addAllSafetySettings(new ArrayList<SafetySetting>())
          .setGenerationConfig(GenerationConfig.newBuilder().build())
          .build();
  ServerStream<GenerateContentResponse> stream =
      predictionServiceClient.streamGenerateContentCallable().call(request);
  for (GenerateContentResponse response : stream) {
    // Do something when a response is received.
  }
}
Returns
Type Description
ServerStreamingCallable<GenerateContentRequest,GenerateContentResponse>

streamRawPredictCallable()

public final ServerStreamingCallable<StreamRawPredictRequest,HttpBody> streamRawPredictCallable()

Perform a streaming online prediction with an arbitrary HTTP payload.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  StreamRawPredictRequest request =
      StreamRawPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setHttpBody(HttpBody.newBuilder().build())
          .build();
  ServerStream<HttpBody> stream =
      predictionServiceClient.streamRawPredictCallable().call(request);
  for (HttpBody response : stream) {
    // Do something when a response is received.
  }
}
Returns
Type Description
ServerStreamingCallable<StreamRawPredictRequest,com.google.api.HttpBody>

streamingPredictCallable()

public final BidiStreamingCallable<StreamingPredictRequest,StreamingPredictResponse> streamingPredictCallable()

Perform a streaming online prediction request for Vertex first-party products and frameworks.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  BidiStream<StreamingPredictRequest, StreamingPredictResponse> bidiStream =
      predictionServiceClient.streamingPredictCallable().call();
  StreamingPredictRequest request =
      StreamingPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllInputs(new ArrayList<Tensor>())
          .setParameters(Tensor.newBuilder().build())
          .build();
  bidiStream.send(request);
  for (StreamingPredictResponse response : bidiStream) {
    // Do something when a response is received.
  }
}
Returns
Type Description
BidiStreamingCallable<StreamingPredictRequest,StreamingPredictResponse>

streamingRawPredictCallable()

public final BidiStreamingCallable<StreamingRawPredictRequest,StreamingRawPredictResponse> streamingRawPredictCallable()

Perform a streaming online prediction request through gRPC.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  BidiStream<StreamingRawPredictRequest, StreamingRawPredictResponse> bidiStream =
      predictionServiceClient.streamingRawPredictCallable().call();
  StreamingRawPredictRequest request =
      StreamingRawPredictRequest.newBuilder()
          .setEndpoint(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .setMethodName("methodName-723163380")
          .setInput(ByteString.EMPTY)
          .build();
  bidiStream.send(request);
  for (StreamingRawPredictResponse response : bidiStream) {
    // Do something when a response is received.
  }
}
Returns
Type Description
BidiStreamingCallable<StreamingRawPredictRequest,StreamingRawPredictResponse>

testIamPermissions(TestIamPermissionsRequest request)

public final TestIamPermissionsResponse testIamPermissions(TestIamPermissionsRequest request)

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error.

Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  TestIamPermissionsRequest request =
      TestIamPermissionsRequest.newBuilder()
          .setResource(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllPermissions(new ArrayList<String>())
          .build();
  TestIamPermissionsResponse response = predictionServiceClient.testIamPermissions(request);
}
Parameter
Name Description
request com.google.iam.v1.TestIamPermissionsRequest

The request object containing all of the parameters for the API call.

Returns
Type Description
com.google.iam.v1.TestIamPermissionsResponse
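
As the note above says, the result is suited to permission-aware UIs rather than authorization decisions. A minimal sketch of that use follows: it compares the permissions that were requested with the ones that came back, so a UI could grey out the actions the caller lacks. `PermissionDiff` and `missing` are hypothetical names, not part of the client library, and the permission strings are illustrative placeholders. The `granted` list stands in for what `response.getPermissionsList()` would return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the client library.
final class PermissionDiff {
  private PermissionDiff() {}

  /** Returns the requested permissions that are absent from the granted list, in request order. */
  static List<String> missing(List<String> requested, List<String> granted) {
    Set<String> remaining = new LinkedHashSet<>(requested);
    remaining.removeAll(granted);
    return new ArrayList<>(remaining);
  }

  public static void main(String[] args) {
    // Placeholder permission names; substitute the ones your tool cares about.
    List<String> requested =
        Arrays.asList("aiplatform.endpoints.predict", "aiplatform.endpoints.get");
    // Stand-in for the list carried by a TestIamPermissionsResponse.
    List<String> granted = Arrays.asList("aiplatform.endpoints.get");
    System.out.println(missing(requested, granted)); // prints [aiplatform.endpoints.predict]
  }
}
```

Because the operation may "fail open", a result like this is best treated as a display hint; the service still enforces access when the real call is made.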

testIamPermissionsCallable()

public final UnaryCallable<TestIamPermissionsRequest, TestIamPermissionsResponse> testIamPermissionsCallable()

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error.

Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Sample code:


// This snippet has been automatically generated and should be regarded as a code template only.
// It will require modifications to work:
// - It may require correct/in-range values for request initialization.
// - It may require specifying regional endpoints when creating the service client as shown in
// https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
try (PredictionServiceClient predictionServiceClient = PredictionServiceClient.create()) {
  TestIamPermissionsRequest request =
      TestIamPermissionsRequest.newBuilder()
          .setResource(
              EndpointName.ofProjectLocationEndpointName(
                      "[PROJECT]", "[LOCATION]", "[ENDPOINT]")
                  .toString())
          .addAllPermissions(new ArrayList<String>())
          .build();
  ApiFuture<TestIamPermissionsResponse> future =
      predictionServiceClient.testIamPermissionsCallable().futureCall(request);
  // Do something.
  TestIamPermissionsResponse response = future.get();
}
Returns
Type Description
UnaryCallable<com.google.iam.v1.TestIamPermissionsRequest,com.google.iam.v1.TestIamPermissionsResponse>
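
The sample above blocks on `future.get()` with no time limit. Since `ApiFuture` extends `java.util.concurrent.Future`, the timed `get(long, TimeUnit)` overload is also available. The sketch below shows one way to bound the wait and fall back to a default value. `FutureAwait` and `awaitOrFallback` are hypothetical names introduced for illustration, and a `CompletableFuture` stands in for the `ApiFuture` so the example runs without the client library.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of the client library.
final class FutureAwait {
  private FutureAwait() {}

  /** Waits up to timeoutMillis for the result; on timeout, cancels the future and returns fallback. */
  static <T> T awaitOrFallback(Future<T> future, long timeoutMillis, T fallback) {
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true);
      return fallback;
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can observe the interruption.
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("call failed", e.getCause());
    }
  }

  public static void main(String[] args) {
    // Stand-ins for the ApiFuture returned by futureCall(request).
    Future<String> done = CompletableFuture.completedFuture("granted");
    System.out.println(awaitOrFallback(done, 100, "unknown")); // prints granted

    Future<String> never = new CompletableFuture<>();
    System.out.println(awaitOrFallback(never, 50, "unknown")); // prints unknown
  }
}
```

Whether a fallback value is appropriate depends on the caller; for a permission-aware UI, treating a timed-out check as "unknown" may be reasonable, while other callers may prefer to surface the timeout.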

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-11-19 UTC.