Configure custom status events to describe runnables
This document explains how to configure custom status events, which describe a job's runnables, when you create and run a Batch job. To learn about status events, see View a job's history through status events.
Custom status events let you provide additional details in a task's history about the progress of its runnables, which can help make a job easier to analyze and troubleshoot. For example, you can configure custom status events that describe when a runnable starts, a runnable ends, a barrier runnable is reached, or an important event happens during the progression of your code.
Before you begin
- If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
- To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:
  - Batch Job Editor (roles/batch.jobsEditor) on the project
  - Service Account User (roles/iam.serviceAccountUser) on the job's service account, which by default is the default Compute Engine service account
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Configure custom status events
Configure custom status events by using one or more of the following options when you create a job:
- Describe a runnable's state by defining its display name. You can do this when you create a job using the gcloud CLI or Batch API.
- Indicate important runtime events by writing a structured task log with the batch/custom/event field for each event. You can do this when using any method to create a job, as part of the definitions of script and container runnables.
Describe a runnable's state
You can configure custom status events that describe a runnable's state by
defining a runnable's display name
(displayName field).
The resulting custom status events vary slightly for different types of
runnables:
- If you define a display name for a container runnable or script runnable, then Batch automatically adds two types of custom status events. The first indicates whenever a task starts this runnable. The second indicates whenever a task finishes this runnable, along with the corresponding exit code.
- If you define a display name for a barrier runnable, then Batch automatically adds a custom status event that indicates whenever a task reaches this barrier.
To create and run a job with custom status events that describe a
runnable's state, define the displayName field for one or more
runnables using the gcloud CLI, Batch API, or a client library.
gcloud
Use the Google Cloud CLI to
create a job that
includes the displayName field in one or more runnables definitions
in the JSON file:
...
"runnables":[
{
"displayName":DISPLAY_NAME,
...
}
]
...
For example, a job with custom status events that describe each runnable's state can have a JSON configuration file similar to the following:
{
"taskGroups":[
{
"taskSpec":{
"runnables":[
{
"displayName":"DISPLAY_NAME1",
"script":{
"text":"echo Hello world from script 1 for task ${BATCH_TASK_INDEX}"
}
},
{
"displayName":"DISPLAY_NAME2",
"barrier":{}
},
{
"displayName":"DISPLAY_NAME3",
"script":{
"text":"echo Hello world from script 2 for task ${BATCH_TASK_INDEX}"
}
}
]
},
"taskCount":3
}
],
"logsPolicy":{
"destination":"CLOUD_LOGGING"
}
}
Replace DISPLAY_NAME1, DISPLAY_NAME2, and DISPLAY_NAME3 with the names of
the runnables. Each name must be unique within the job (for example,
script 1, barrier 1, and script 2).
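The configuration file above can also be generated by a short script. The following is a minimal sketch rather than part of the Batch tooling: the helper name build_job_config is hypothetical, and only the JSON fields are taken from the example above. It rejects duplicate display names, because each display name must be unique within the job.

```python
import json


def build_job_config(display_names, task_count=3):
    """Build the example job configuration (hypothetical helper).

    display_names must contain three unique names, used in order for the
    first script, the barrier, and the second script.
    """
    if len(display_names) != 3:
        raise ValueError("expected exactly three display names")
    if len(set(display_names)) != len(display_names):
        raise ValueError("display names must be unique within the job")

    script_1, barrier_1, script_2 = display_names
    return {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "displayName": script_1,
                            "script": {
                                "text": "echo Hello world from script 1 for task ${BATCH_TASK_INDEX}"
                            },
                        },
                        {"displayName": barrier_1, "barrier": {}},
                        {
                            "displayName": script_2,
                            "script": {
                                "text": "echo Hello world from script 2 for task ${BATCH_TASK_INDEX}"
                            },
                        },
                    ]
                },
                "taskCount": task_count,
            }
        ],
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


if __name__ == "__main__":
    # Print the configuration so it can be redirected into a JSON file.
    config = build_job_config(["script 1", "barrier 1", "script 2"])
    print(json.dumps(config, indent=2))
```

Redirecting the printed output to a file gives a configuration file with the same shape as the example.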
API
Use the REST API to
create a job that
includes the displayName field in one or more runnables definitions
in the JSON file:
...
"runnables":[
{
"displayName":DISPLAY_NAME,
...
}
]
...
For example, a job with custom status events that describe each runnable's state can have a JSON configuration file similar to the following:
{
"taskGroups":[
{
"taskSpec":{
"runnables":[
{
"displayName":"DISPLAY_NAME1",
"script":{
"text":"echo Hello world from script 1 for task ${BATCH_TASK_INDEX}"
}
},
{
"displayName":"DISPLAY_NAME2",
"barrier":{}
},
{
"displayName":"DISPLAY_NAME3",
"script":{
"text":"echo Hello world from script 2 for task ${BATCH_TASK_INDEX}"
}
}
]
},
"taskCount":3
}
],
"logsPolicy":{
"destination":"CLOUD_LOGGING"
}
}
Replace DISPLAY_NAME1, DISPLAY_NAME2, and DISPLAY_NAME3 with the names of
the runnables. Each name must be unique within the job (for example,
script 1, barrier 1, and script 2).
Go
import (
	"context"
	"fmt"
	"io"

	batch "cloud.google.com/go/batch/apiv1"
	"cloud.google.com/go/batch/apiv1/batchpb"
	durationpb "google.golang.org/protobuf/types/known/durationpb"
)

// Creates and runs a job with custom events
func createJobWithCustomEvents(w io.Writer, projectID, jobName string) (*batchpb.Job, error) {
	region := "us-central1"
	displayName1 := "script 1"
	displayName2 := "barrier 1"
	displayName3 := "script 2"

	ctx := context.Background()
	batchClient, err := batch.NewClient(ctx)
	if err != nil {
		return nil, fmt.Errorf("batchClient error: %w", err)
	}
	defer batchClient.Close()

	runn1 := &batchpb.Runnable{
		Executable: &batchpb.Runnable_Script_{
			Script: &batchpb.Runnable_Script{
				Command: &batchpb.Runnable_Script_Text{
					Text: "echo Hello world from script 1 for task ${BATCH_TASK_INDEX}",
				},
			},
		},
		DisplayName: displayName1,
	}

	runn2 := &batchpb.Runnable{
		Executable: &batchpb.Runnable_Barrier_{
			Barrier: &batchpb.Runnable_Barrier{},
		},
		DisplayName: displayName2,
	}

	runn3 := &batchpb.Runnable{
		Executable: &batchpb.Runnable_Script_{
			Script: &batchpb.Runnable_Script{
				Command: &batchpb.Runnable_Script_Text{
					Text: "echo Hello world from script 2 for task ${BATCH_TASK_INDEX}",
				},
			},
		},
		DisplayName: displayName3,
	}

	// Replace DESCRIPTION with a description for the custom status event.
	runn4 := &batchpb.Runnable{
		Executable: &batchpb.Runnable_Script_{
			Script: &batchpb.Runnable_Script{
				Command: &batchpb.Runnable_Script_Text{
					Text: "sleep 30; echo '{\"batch/custom/event\": \"DESCRIPTION\"}'; sleep 30",
				},
			},
		},
	}

	taskSpec := &batchpb.TaskSpec{
		ComputeResource: &batchpb.ComputeResource{
			// CpuMilli is milliseconds per cpu-second. This means the task requires 2 whole CPUs.
			CpuMilli:  2000,
			MemoryMib: 16,
		},
		MaxRunDuration: &durationpb.Duration{
			Seconds: 3600,
		},
		MaxRetryCount: 2,
		Runnables:     []*batchpb.Runnable{runn1, runn2, runn3, runn4},
	}

	taskGroups := []*batchpb.TaskGroup{
		{
			TaskCount: 4,
			TaskSpec:  taskSpec,
		},
	}

	labels := map[string]string{"env": "testing", "type": "container"}

	// Policies are used to define on what kind of virtual machines the tasks will run on.
	// In this case, we tell the system to use "e2-standard-4" machine type.
	// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
	allocationPolicy := &batchpb.AllocationPolicy{
		Instances: []*batchpb.AllocationPolicy_InstancePolicyOrTemplate{{
			PolicyTemplate: &batchpb.AllocationPolicy_InstancePolicyOrTemplate_Policy{
				Policy: &batchpb.AllocationPolicy_InstancePolicy{
					MachineType: "e2-standard-4",
				},
			},
		}},
	}

	// We use Cloud Logging as it's an out of the box available option
	logsPolicy := &batchpb.LogsPolicy{
		Destination: batchpb.LogsPolicy_CLOUD_LOGGING,
	}

	job := &batchpb.Job{
		Name:             jobName,
		TaskGroups:       taskGroups,
		AllocationPolicy: allocationPolicy,
		Labels:           labels,
		LogsPolicy:       logsPolicy,
	}

	request := &batchpb.CreateJobRequest{
		Parent: fmt.Sprintf("projects/%s/locations/%s", projectID, region),
		JobId:  jobName,
		Job:    job,
	}

	created_job, err := batchClient.CreateJob(ctx, request)
	if err != nil {
		return nil, fmt.Errorf("unable to create job: %w", err)
	}

	fmt.Fprintf(w, "Job created: %v\n", created_job)
	return created_job, nil
}
Java
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.LogsPolicy.Destination;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.Runnable.Barrier;
import com.google.cloud.batch.v1.Runnable.Script;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateBatchCustomEvent {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Google Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";
    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";
    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";
    // Name of the runnable, which must be unique
    // within the job. For example: script 1, barrier 1, and script 2.
    String displayName1 = "script 1";
    String displayName2 = "barrier 1";
    String displayName3 = "script 2";

    createBatchCustomEvent(projectId, region, jobName, displayName1, displayName2, displayName3);
  }

  // Configure custom status events, which describe a job's runnables,
  // when you create and run a Batch job.
  public static Job createBatchCustomEvent(String projectId, String region, String jobName,
                                           String displayName1, String displayName2,
                                           String displayName3)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {
      TaskSpec task = TaskSpec.newBuilder()
          // Jobs can be divided into tasks. In this case, we have only one task.
          .addAllRunnables(buildRunnables(displayName1, displayName2, displayName3))
          .setMaxRetryCount(2)
          .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
          .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder()
          .setTaskCount(3)
          .setParallelism(3)
          .setTaskSpec(task)
          .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .putLabels("env", "testing")
              .putLabels("type", "script")
              // We use Cloud Logging as it's an out of the box available option.
              .setLogsPolicy(
                  LogsPolicy.newBuilder().setDestination(Destination.CLOUD_LOGGING))
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());

      return result;
    }
  }

  // Create runnables with custom scripts
  private static Iterable<Runnable> buildRunnables(String displayName1, String displayName2,
                                                   String displayName3) {
    List<Runnable> runnables = new ArrayList<>();

    // Define what will be done as part of the job.
    runnables.add(Runnable.newBuilder()
        .setDisplayName(displayName1)
        .setScript(
            Script.newBuilder()
                .setText(
                    "echo Hello world from script 1 for task ${BATCH_TASK_INDEX}")
                // You can also run a script from a file. Just remember, that needs to be a
                // script that's already on the VM that will be running the job.
                // Using setText() and setPath() is mutually exclusive.
                // .setPath("/tmp/test.sh")
        )
        .build());

    runnables.add(Runnable.newBuilder()
        .setDisplayName(displayName2)
        .setBarrier(Barrier.newBuilder())
        .build());

    runnables.add(Runnable.newBuilder()
        .setDisplayName(displayName3)
        .setScript(
            Script.newBuilder()
                .setText("echo Hello world from script 2 for task ${BATCH_TASK_INDEX}"))
        .build());

    runnables.add(Runnable.newBuilder()
        .setScript(
            Script.newBuilder()
                // Replace DESCRIPTION with a description
                // for the custom status event, for example, halfway done.
                .setText("sleep 30; echo '{\"batch/custom/event\": \"DESCRIPTION\"}'; sleep 30"))
        .build());

    return runnables;
  }
}
Node.js
// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

/**
 * TODO(developer): Update these variables before running the sample.
 */
// Project ID or project number of the Google Cloud project you want to use.
const projectId = await batchClient.getProjectId();
// Name of the region you want to use to run the job. Regions that are
// available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
const region = 'europe-central2';
// The name of the job that will be created.
// It needs to be unique for each project and region pair.
const jobName = 'batch-custom-events-job';
// Name of the runnable, which must be unique
// within the job. For example: script 1, barrier 1, and script 2.
const displayName1 = 'script 1';
const displayName2 = 'barrier 1';
const displayName3 = 'script 2';

// Create runnables with custom scripts
const runnable1 = new batch.Runnable({
  displayName: displayName1,
  script: new batch.Runnable.Script({
    commands: [
      '-c',
      'echo Hello world from script 1 for task ${BATCH_TASK_INDEX}.',
    ],
  }),
});

const runnable2 = new batch.Runnable({
  displayName: displayName2,
  barrier: new batch.Runnable.Barrier(),
});

const runnable3 = new batch.Runnable({
  displayName: displayName3,
  script: new batch.Runnable.Script({
    // Replace DESCRIPTION with a description
    // for the custom status event, for example, halfway done.
    commands: [
      'sleep 30; echo \'{"batch/custom/event": "DESCRIPTION"}\'; sleep 30',
    ],
  }),
});

const task = new batch.TaskSpec({
  runnables: [runnable1, runnable2, runnable3],
  maxRetryCount: 2,
  maxRunDuration: {seconds: 3600},
});

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup({
  taskCount: 3,
  taskSpec: task,
});

const job = new batch.Job({
  name: jobName,
  taskGroups: [group],
  labels: {env: 'testing', type: 'script'},
  // We use Cloud Logging as it's an option available out of the box
  logsPolicy: new batch.LogsPolicy({
    destination: batch.LogsPolicy.Destination.CLOUD_LOGGING,
  }),
});

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateBatchCustomEvents() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const [response] = await batchClient.createJob(request);
  console.log(JSON.stringify(response));
}

await callCreateBatchCustomEvents();
Python
from google.cloud import batch_v1


def create_job_with_status_events(
    project_id: str, region: str, job_name: str
) -> batch_v1.Job:
    """
    This method shows the creation of a Batch job with custom status events which describe runnables.
    Within the method, the state of a runnable is described by defining its display name.
    The script text is modified to change the commands that are executed, and barriers are adjusted
    to synchronize tasks at specific points.

    Args:
        project_id (str): project ID or project number of the Cloud project you want to use.
        region (str): name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/locations
        job_name (str): the name of the job that will be created.
            It needs to be unique for each project and region pair.

    Returns:
        A job object representing the job created with additional runnables and custom events.
    """
    client = batch_v1.BatchServiceClient()

    # Executes a simple script that prints a message.
    runn1 = batch_v1.Runnable()
    runn1.display_name = "Script 1"
    runn1.script.text = "echo Hello world from Script 1 for task ${BATCH_TASK_INDEX}"

    # Acts as a barrier to synchronize the execution of subsequent runnables.
    runn2 = batch_v1.Runnable()
    runn2.display_name = "Barrier 1"
    runn2.barrier = batch_v1.Runnable.Barrier({"name": "hello-barrier"})

    # Executes another script that prints a message, intended to run after the barrier.
    runn3 = batch_v1.Runnable()
    runn3.display_name = "Script 2"
    runn3.script.text = "echo Hello world from Script 2 for task ${BATCH_TASK_INDEX}"

    # Executes a script that imitates a delay and creates a custom event for monitoring purposes.
    runn4 = batch_v1.Runnable()
    runn4.script.text = (
        'sleep 30; echo \'{"batch/custom/event": "EVENT_DESCRIPTION"}\'; sleep 30'
    )

    # Jobs can be divided into tasks. In this case, we have only one task.
    task = batch_v1.TaskSpec()
    # Assigning a list of runnables to the task.
    task.runnables = [runn1, runn2, runn3, runn4]

    # We can specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
    resources.memory_mib = 16  # in MiB
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Tasks are grouped inside a job using TaskGroups.
    # Currently, it's possible to have only one task group.
    group = batch_v1.TaskGroup()
    group.task_count = 4
    group.task_spec = task

    # Policies are used to define on what kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "container"}
    # We use Cloud Logging as it's an out of the box available option
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)
After the example job has finished running, the resulting custom status events for each task are similar to the following:
statusEvents:
...
- description: 'script at index #0 with display name [DISPLAY_NAME1] started.'
eventTime: '...'
type: RUNNABLE_EVENT
- description: 'script at index #0 with display name [DISPLAY_NAME1] finished with exit
code 0.'
eventTime: '...'
type: RUNNABLE_EVENT
- description: 'barrier at index #1 with display name [DISPLAY_NAME2] reached.'
eventTime: '...'
type: BARRIER_REACHED_EVENT
- description: 'script at index #2 with display name [DISPLAY_NAME3] started.'
eventTime: '...'
type: RUNNABLE_EVENT
- description: 'script at index #2 with display name [DISPLAY_NAME3] finished with exit
code 0.'
eventTime: '...'
type: RUNNABLE_EVENT
...
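When analyzing many tasks, it can help to turn these descriptions into structured records. The following is a minimal sketch, assuming the description wording matches the sample output above. The pattern is inferred from that sample and is not a documented format, the word "container" for container runnables is an assumption, and the helper name parse_runnable_event is hypothetical.

```python
import re

# Pattern inferred from the sample output; the exact wording is an assumption.
_EVENT_PATTERN = re.compile(
    r"^(?P<kind>script|container|barrier) at index #(?P<index>\d+) "
    r"with display name \[(?P<name>.+)\] "
    r"(?P<state>started|reached|finished with exit code (?P<exit_code>-?\d+))\.$"
)


def parse_runnable_event(description):
    """Parse a display-name status event description (hypothetical helper).

    Returns a dict with kind, index, display_name, state, and exit_code,
    or None if the description does not match the assumed wording.
    """
    # Collapse line wraps such as "exit\n  code 0." into single spaces.
    normalized = " ".join(description.split())
    match = _EVENT_PATTERN.match(normalized)
    if match is None:
        return None

    exit_code = match.group("exit_code")
    return {
        "kind": match.group("kind"),
        "index": int(match.group("index")),
        "display_name": match.group("name"),
        "state": "finished" if exit_code is not None else match.group("state"),
        "exit_code": int(exit_code) if exit_code is not None else None,
    }
```

Descriptions that do not match, such as free-form custom event strings, return None, so the helper can be applied to every status event of a task without filtering by type first.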
Indicate important runtime events
You can configure custom status events that indicate when an important event
happens while a runnable is running by configuring that runnable to write a
structured task log that defines a string for the Batch
custom status event (batch/custom/event) field.
If a container runnable or script runnable writes a structured task log
that defines the batch/custom/event JSON field, it produces a
custom status event at that time. Although you might configure the structured
task log to include additional fields, the custom status event only includes
the string for the batch/custom/event field.
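If a runnable executes a Python program instead of a shell one-liner, the same structured log can be printed from Python. The following is a minimal sketch, assuming that a single-line JSON object written to standard output is treated like the echo commands used elsewhere on this page. The helper name emit_custom_event and the extra field progress are hypothetical.

```python
import json


def emit_custom_event(description, **extra_fields):
    """Print one structured log line that defines batch/custom/event.

    Extra fields are written to the log line, but only the description
    string is expected to appear in the custom status event.
    """
    record = dict(extra_fields)
    record["batch/custom/event"] = description
    line = json.dumps(record)
    # Write the JSON on a single line and flush so it is logged immediately.
    print(line, flush=True)
    return line


if __name__ == "__main__":
    emit_custom_event("halfway done", progress=50)
```

Returning the printed line makes the helper easy to check in a unit test without capturing standard output.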
To create and run a job with custom status events that indicate when an
important event happens, configure one or more runnables to
write a structured log by printing JSON
and define the batch/custom/event field as part of the log.
...
"runnables":[
{
...
"echo '{\"batch/custom/event\":\"EVENT_DESCRIPTION\"}'"
...
}
]
...
"logsPolicy":{
"destination":"CLOUD_LOGGING"
}
...
For example, a job with custom status events that indicate when an important event happens can have a JSON configuration file similar to the following:
{
"taskGroups":[
{
"taskSpec":{
"runnables":[
{
"script":{
"text":"sleep 30; echo '{\"batch/custom/event\": \"EVENT_DESCRIPTION\"}'; sleep 30"
}
}
]
},
"taskCount":3
}
],
"logsPolicy":{
"destination":"CLOUD_LOGGING"
}
}
Replace EVENT_DESCRIPTION with a description for the
custom status event, for example, halfway done.
After the example job has finished running, the resulting custom status event for each task is similar to the following:
statusEvents:
...
- description: EVENT_DESCRIPTION
eventTime: '...'
type: RUNNABLE_CUSTOM_EVENT
...
What's next
- If you have issues creating or running a job, see Troubleshooting.
- Learn how to view status events.
- Learn how to write task logs.
- Learn about more job creation options.