splitbatch

splitbatch(1) General Commands Manual splitbatch(1)
NAME
 splitbatch - Produce multiple command files for running batchruntomo in
 parallel
SYNOPSIS
 splitbatch -num # [-max #] batchruntomo_command_file
DESCRIPTION
 Splitbatch is a Python script that will take a command file for running
 the Batchruntomo program on multiple tilt series and produce multi-
 ple command files (jobs), with one data set per file, to be run in par-
 allel by Processchunks.
 The input command file should contain CPUMachineList and GPUMachineList
 entries to Batchruntomo with the full collection of resources avail-
 able for all of the jobs. Splitbatch will make sure that the CPUMa-
 chineList specifies at least two machines, but does not otherwise deal
 with the CPUs. Instead, Processchunks must be given this full pro-
 cessor list as well as the maximum number of jobs to run in parallel in
 its -M option. It will divide up the CPUs so that each job is directed
 to use a specific set of CPUs, and so that the first machine assigned
 for each job, the one the job is run on, has as many cores as possible
 to use for programs parallelized with OpenMP. In this way, competition
 between jobs for CPU cores should be minimized.
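 As a rough illustration of this kind of division, the Python sketch
 below splits a machine list among jobs so that each job's first machine
 is one with many cores. It is not the algorithm that Processchunks
 actually uses: the function name, the (name, cores) representation, the
 machine names, and the round-robin handling of the remaining machines
 are all assumptions made for the example.

```python
def divide_machines(machines, max_jobs):
    """Split (name, cores) pairs into per-job lists.

    Hypothetical scheme: the machines with the most cores become the
    first (host) machine of each job; the rest are dealt out round-robin.
    """
    if max_jobs < 1:
        raise ValueError("max_jobs must be at least 1")
    # Largest machines first, so each job's host has as many cores as possible
    ordered = sorted(machines, key=lambda m: m[1], reverse=True)
    num_jobs = min(max_jobs, len(ordered))
    jobs = [[] for _ in range(num_jobs)]
    for index, machine in enumerate(ordered):
        jobs[index % num_jobs].append(machine)
    return jobs


# Made-up machine names and core counts, with two jobs run in parallel
machines = [("alpha", 4), ("beta", 16), ("gamma", 8), ("delta", 12)]
jobs = divide_machines(machines, 2)
```

 With these made-up inputs, the two machines with the most cores each
 become the first machine of one job, and no machine is shared between
 jobs.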
 Since GPUs are more scarce and used only occasionally, the GPU list is
 not subdivided and assigned either in the command files or by
 Processchunks. Instead, when a job needs to run a step with GPUs, it
 passes the full list of available GPUs to the Gpuallocator program
 along with the maximum number of GPUs that it would like to use. The
 program keeps track of GPUs allocated to the set of jobs and responds
 with whatever GPUs are available, from 1 to the given maximum. If none
 are available, Batchruntomo will keep trying indefinitely, issuing
 periodic warnings when the wait is long. It is not clear whether the
 best strategy is to set the maximum to the full number available or
 not. If it is set to the full number, then every job will get all the
 GPUs when they are free and other jobs will have to wait when they are
 already reserved. Otherwise, a job is much less likely to have to wait
 but will get fewer GPUs, perhaps only one, while the rest of the GPUs
 may become idle when their jobs finish.
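 The tradeoff can be seen in a toy model of the bookkeeping described
 above. This is only a sketch, not the Gpuallocator program itself: the
 class, its methods, and the GPU and job names are hypothetical, and a
 real allocator would also have to coordinate between separate
 processes.

```python
class ToyGpuAllocator:
    """Hypothetical stand-in for the GPU bookkeeping described above."""

    def __init__(self, gpus):
        self.free = list(gpus)   # GPUs not reserved by any job
        self.reserved = {}       # job name -> list of GPUs it holds

    def request(self, job, maximum):
        """Return up to `maximum` free GPUs, or [] if the job must wait."""
        granted = self.free[:maximum]
        self.free = self.free[len(granted):]
        if granted:
            self.reserved.setdefault(job, []).extend(granted)
        return granted

    def release(self, job):
        """Return all GPUs held by `job` to the free pool."""
        self.free.extend(self.reserved.pop(job, []))


allocator = ToyGpuAllocator(["gpu1", "gpu2", "gpu3", "gpu4"])
first = allocator.request("job-001", 4)    # takes every GPU
second = allocator.request("job-002", 4)   # nothing left, so it would wait
allocator.release("job-001")
third = allocator.request("job-002", 2)    # gets two, leaving two free
```

 With the maximum set to the full number, the second job gets nothing
 until the first releases its GPUs; with a smaller maximum, it gets GPUs
 sooner but leaves some unused.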
 With this arrangement for allocating GPUs dynamically, the jobs must be
 run with Processchunks at the command line.
 All of the jobs can be controlled through a single file that Batchrun-
 tomo(1) checks for quit, pause, or finish signals. The name of this
 file is the rootname of the batch input file with extension ".cmds".
 To make all jobs quit as soon as possible, use the command
 echo Q > rootname.cmds
 or use F instead of Q to make them all quit after finishing their cur-
 rent data sets.
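 The same signal can be written from a script. The following Python
 sketch is equivalent to the echo command above; the function name and
 its arguments are hypothetical, and it accepts only the Q and F signals
 named here.

```python
import tempfile
from pathlib import Path


def signal_jobs(rootname, signal, directory="."):
    """Write a one-letter signal to rootname.cmds, like `echo Q > rootname.cmds`."""
    if signal not in ("Q", "F"):
        raise ValueError("only the Q and F signals are covered here")
    path = Path(directory) / (rootname + ".cmds")
    path.write_text(signal + "\n")
    return path


# Demonstration in a scratch directory; in practice the file belongs in
# the directory where the jobs are running.
with tempfile.TemporaryDirectory() as workdir:
    written = signal_jobs("batchJul25-193628_BB1", "F", workdir)
    contents = written.read_text()
```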
 Using a Cluster Queue
 The management of resources is completely different when running
 Batchruntomo in parallel on a cluster queue. Four different
 arrangements for use of cluster resources are supported:
 1) A single CPU per run. The top-level Processchunks puts each
 Batchruntomo job on the queue up to the number to run in parallel.
 The batch command file has information about the queue command and
 maximum number of jobs to submit at once, subject to override by
 environment variables set by Processchunks when it runs Batchruntomo.
 Batchruntomo runs single command files directly and
 runs operations that can be parallelized in chunks on the queue with
 Processchunks. Everything is limited to a single thread, so opera-
 tions parallelized with multi-threading (most of the single command
 files) will run more slowly than usual. This mode provides efficient
 use of CPU resources on the cluster but may incur less efficient use of
 the file system and memory caching if many data sets are run at once.
 2) Multiple CPU cores and one or more GPUs per run. This requires
 that the Queuechunk command being used (generally obtained by Etomo
 from cpu.adoc) has a resource request for a specific number of
 cores and possibly GPUs. Batchruntomo runs single command files
 directly and allows them to use all the cores available for multi-
 threaded operations. Operations on a CPU that can be parallelized in
 chunks are run directly with Processchunks, up to the given number of
 cores. Operations on a GPU are also run directly in chunks if more
 than one GPU is available, or as a single command file if there is only
 one GPU. This mode is comparable to running Batchruntomo on a sin-
 gle computer, where sometimes the cores and more often the GPU are
 fully utilized.
 3-1 and 3-2) A separate queue for GPU operations, with one GPU per
 task. The main queue would not have a GPU allocated. Steps using a GPU
 are run on this secondary queue. Because Batchruntomo has to submit
 each GPU operation to run on this queue, and no single IMOD process
 uses more than one GPU, only a single GPU can be taken advantage of by
 each command file or chunk being run. When more than one GPU is
 available overall on this queue, operations will be split into chunks and
 each chunk submitted to the queue; otherwise a single command file will
 be submitted. In mode 3-1, only a single core is available and non-GPU
 operations are run as for mode 1. In mode 3-2, multiple cores are
 available to each job and non-GPU operations are run as for mode 2.
 This mode allows more efficient use of GPUs on the cluster, provided
 that the cluster is configured to allocate just one GPU from a multi-
 GPU node when the queue submission requests that. If not, mode 2 is
 best.
 Each of these modes involves a particular set of options given to
 Processchunks(1) and Batchruntomo. The options in the batch command
 file will govern if the command file is run outside of Process-
 chunks(1), but if it is run by Processchunks, the options provided
 to the latter override the ones in the command file.
 Mode 1) The queue is specified to Processchunks by
 -q: the maximum number of queue entries
 The machine list, a queuechunk command
 and to Batchruntomo by
 -QueueCommand: the queuechunk command
 -MaxJobsOnQueue: the maximum number of queue entries
 Mode 2) The queue is specified to Processchunks by
 -q: the maximum number of queue entries
 The machine list, a queuechunk command
 -JC: the number of cores
 -JG: the number of GPUs
 and to Batchruntomo by:
 -CoresPerClusterJob: the number of cores
 -GPUsPerClusterJob: the number of GPUs
 Modes 3-1 and 3-2) The secondary queue is specified to Process-
 chunks(1) by
 -SQ: the queuechunk command
 -SN: the maximum number of queue entries to submit at once
 and to Batchruntomo by
 -GPUQueueCommand: the queuechunk command
 -MaxGPUJobsOnQueue: the maximum number of queue entries
 The primary queue is specified to both programs as for modes 1 and 2,
 respectively, except that for mode 2, only cores should be specified,
 not GPUs.
 To enforce consistent usage, Splitbatch will object if the Batchrun-
 tomo(1) command file has either of the CPUMachineList or GPUMachineList
 options for non-cluster processing included along with the main queue
 options, -QueueCommand or -CoresPerClusterJob.
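 A check of this kind might look like the Python sketch below. It is not
 Splitbatch's own code: it assumes that each option appears in the
 command file as a line starting with the option name followed by its
 value, and that comment lines start with "#". The option values in the
 sample text are made up.

```python
MACHINE_LIST_OPTIONS = {"CPUMachineList", "GPUMachineList"}
MAIN_QUEUE_OPTIONS = {"QueueCommand", "CoresPerClusterJob"}


def find_conflict(command_text):
    """Return (machine_options, queue_options) found together, or None."""
    present = set()
    for line in command_text.splitlines():
        words = line.split()
        # Skip blank lines and comment lines (assumed to start with '#')
        if not words or words[0].startswith("#"):
            continue
        present.add(words[0])
    machine = sorted(present & MACHINE_LIST_OPTIONS)
    queue = sorted(present & MAIN_QUEUE_OPTIONS)
    if machine and queue:
        return machine, queue
    return None


# Made-up fragment mixing non-cluster and cluster options
sample = """\
# hypothetical batch command file fragment
CPUMachineList hostA,hostB
QueueCommand queuechunk -t pbs
"""
conflict = find_conflict(sample)
```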
OPTIONS
 When the program is invoked with no arguments or with -h, it gives a
 usage statement that shows the default values for these options as well
 as the currently allowed abbreviations to the short option names.
 -comfile OR -CommandFile File
 The input Batchruntomo command file name, with or without its
 extension. This entry is required. The command file can be
 entered either specifically with this option or as a non-option
 argument.
 -maxgpu OR -MaxGPUsForOneJob value
 Maximum number of GPUs to request for a step that needs a GPU.
 The default is 4.
 -help OR -usage
 Print a usage statement and exit.
CLUSTER EXAMPLES
 This section lists some Processchunks commands for running in the
 different modes, and also shows how the entries should appear in
 cpu.adoc.
 Mode 1 - a simple queue named "cluster":
 processchunks -M 4 -q 32 -Q cluster 'queuechunk -t pbs' \
 batchJul25-193628_BB1
 cpu.adoc:
 [Queue = cluster]
 command = queuechunk -t pbs
 number = 300
 Mode 2 - a queue where multiple cores and GPUs can be requested:
 processchunks -M 4 -q 6 -JC 8 -JG 4 \
 'queuechunk -t slurm -l -c8,-n1,--partition=sgpu' \
 batchJul25-193628_BB1
 cpu.adoc:
 [Queue = BatchGPU]
 command = queuechunk -t slurm -l -c8,-n1,--partition=sgpu
 coresPerClusterJob = 8
 gpusPerClusterJob = 4
 number = 8
 Such a configuration is best for accessing GPUs when the cluster
 software is not configured to give only one GPU instead of all the
 GPUs on the node. Here the "number" would be set to the number of
 such GPU nodes available on the cluster. The coresPerClusterJob
 should be set to the same number as whatever entry in the queue
 command specifies the number of cores to allocate ("-c8" here). Note
 that the attributes coresPerClusterJob and gpusPerClusterJob were
 originally named coresPerNode and gpusPerNode prior to 4.12.40. Both
 were misleading because such a job may not get an entire cluster
 node, and using coresPerNode in this context conflicted with the
 previous usage of it. coresPerNode is still used when specifying a
 slurm queue running in exclusive resource allocation mode, but such a
 queue is not suitable for multi-core processing within a batch job
 and will be treated by Etomo as a simple queue having one core per
 job.
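 The agreement between coresPerClusterJob and the queue command can be
 checked mechanically. The Python sketch below assumes that the core
 count appears in the queue command as a "-cN" entry, as in the example
 above; other queue types may express it differently, and the function
 names are hypothetical.

```python
import re


def cores_in_queue_command(command):
    """Return N from a '-cN' entry in the queue command, or None if absent."""
    match = re.search(r"(?:^|[\s,])-c(\d+)(?=$|[\s,])", command)
    return int(match.group(1)) if match else None


def cores_consistent(command, cores_per_cluster_job):
    """True if the queue command requests exactly the stated number of cores."""
    requested = cores_in_queue_command(command)
    return requested is not None and requested == cores_per_cluster_job


command = "queuechunk -t slurm -l -c8,-n1,--partition=sgpu"
matches = cores_consistent(command, 8)      # agrees with "-c8"
mismatch = cores_consistent(command, 6)     # would disagree with "-c8"
```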
 Mode 3-1 - a primary queue with one core and a secondary queue with one
 GPU:
 processchunks -M 4 -q 32 -SN 4 \
 -SQ 'queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100' \
 'queuechunk -t slurm -l -c1,-n1,--partition=amilan' \
 batchJul25-193628_BB1
 cpu.adoc:
 [Queue = BatchPrimary]
 command = queuechunk -t slurm -l -c1,-n1,--partition=amilan
 number = 300
 [Queue = alpine-1GPU]
 command = queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100
 coresPerClusterJob = 1
 gpusPerClusterJob = 1
 number = 24
 Here the number might be the total number of GPUs available (e.g., 8
 machines with 3 each), provided that the cluster software can assign
 each one separately. With "-SN 4", each batch job will submit up to
 4 chunks to the queue with GPUs. If the queue software does allow
 getting more than one core per job, it might be preferable to use the
 next setup instead:
 Mode 3-2 - a primary queue with six cores and a secondary queue with
 one GPU:
 processchunks -M 4 -q 32 -SN 4 \
 -SQ 'queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100' \
 'queuechunk -t slurm -l -c6,-n1,--partition=amilan' \
 batchJul25-193628_BB1
 cpu.adoc:
 [Queue = BatchCores]
 command = queuechunk -t slurm -l -c6,-n1,--partition=amilan
 coresPerClusterJob = 6
 number = 50
 [Queue = alpine-1GPU]
 command = queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100
 coresPerClusterJob = 1
 gpusPerClusterJob = 1
 number = 24
 If you define any multicore queues, note that these are useful only for
 parallel Batchruntomo processing. For all other parallel processing
 in Etomo, you need to define an additional simple queue with only one
 core allocated per job, since that is all that will be used in those
 situations.
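 When these commands are generated by a script, quoting the queuechunk
 command correctly is the main difficulty. The Python sketch below
 assembles command lines like the mode 2 and mode 3-1 examples above
 using only the options shown in this section. The function and
 parameter names are hypothetical, and shlex.join requires Python 3.8
 or later.

```python
import shlex


def processchunks_command(root, primary_queue, max_jobs, max_queue_entries,
                          cores=None, gpus=None,
                          secondary_queue=None, secondary_entries=None):
    """Build a processchunks command line as a single shell-quoted string."""
    args = ["processchunks", "-M", str(max_jobs), "-q", str(max_queue_entries)]
    if cores is not None:
        args += ["-JC", str(cores)]
    if gpus is not None:
        args += ["-JG", str(gpus)]
    if secondary_queue is not None:
        if secondary_entries is None:
            raise ValueError("secondary_entries is needed with secondary_queue")
        args += ["-SN", str(secondary_entries), "-SQ", secondary_queue]
    # The queue command is one argument, so shlex.join wraps it in quotes
    args += [primary_queue, root]
    return shlex.join(args)


mode2 = processchunks_command(
    "batchJul25-193628_BB1",
    "queuechunk -t slurm -l -c8,-n1,--partition=sgpu",
    max_jobs=4, max_queue_entries=6, cores=8, gpus=4)

mode3_1 = processchunks_command(
    "batchJul25-193628_BB1",
    "queuechunk -t slurm -l -c1,-n1,--partition=amilan",
    max_jobs=4, max_queue_entries=32,
    secondary_queue="queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100",
    secondary_entries=4)
```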
FILES
 The command files are given the same root name as the input file and
 are numbered from "-001". There is a finishing file to remove command
 files, but the log files are left.
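 The numbering scheme can be sketched in Python as follows. The function
 name is hypothetical, and the file extension is left off because it is
 not stated here.

```python
def job_root_names(root, num_jobs):
    """Return root names numbered from -001, one per job."""
    return ["%s-%03d" % (root, number) for number in range(1, num_jobs + 1)]


names = job_root_names("batchJul25-193628_BB1", 3)
```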
AUTHOR
 David Mastronarde <mast at colorado dot edu>
SEE ALSO
 processchunks(1), batchruntomo(1)