Job Limits
Each LC platform is a shared resource. Users are expected to adhere to the following usage policies to ensure that the resources can be used effectively and productively by everyone. You can view these policies on the system itself by running:
news job.lim.MACHINENAME
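For example, assuming this cluster's news item follows its lowercase cluster name (an assumption, not a value given on this page):

    news job.lim.rzhound      # "rzhound" is an assumed name for this system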
Hardware
Each RZHound node is based on the Intel Sapphire Rapids processor, with 56 cores per socket, 2 sockets per node (112 cores per node), and 256 GB of DDR5 memory.
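To confirm this layout from a node you are allocated on, standard Linux tools report the socket, core, and memory configuration (a sketch; these are generic commands, not output captured from this system):

    lscpu | grep -E 'Socket|Core|Model name'   # sockets, cores per socket, CPU model
    free -g                                    # total memory in GB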
Scheduling
Batch jobs are scheduled through SLURM.
- pdebug: 16 nodes (1792 cores), interactive use only.
- pbatch: 358 nodes (40096 cores), batch use only.
Pools     Max nodes/job    Max runtime
---------------------------------------------------
pdebug    4(*)             1 hour
pbatch    32(**)           24 hours
---------------------------------------------------
(*) Please limit the use of pdebug to 8 nodes on a PER USER basis, not a PER JOB basis, so that other users retain access. Pdebug is scheduled using fairshare, and jobs are core-scheduled, not node-scheduled. To allocate whole nodes, add the '--exclusive' flag to your sbatch or salloc command (see the examples below).
(**) In addition to the max nodes/job limit, there is a limit of 32 nodes per user per bank across all of an individual user's jobs.
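As an illustration only (the bank name, script name, and executable below are placeholders, not values defined on this page), a pbatch job that stays within the 32-node, 24-hour limits could be submitted with a script such as:

    #!/bin/bash
    #SBATCH --partition=pbatch        # batch-only pool
    #SBATCH --nodes=32                # at or below the 32 nodes/job limit
    #SBATCH --time=24:00:00           # at or below the 24-hour limit
    #SBATCH --account=mybank          # placeholder bank name
    srun ./my_app                     # placeholder executable

    sbatch my_job.sh                  # submit the script above (placeholder file name)

Similarly, an interactive pdebug allocation that stays within the 4-node, 1-hour limit and requests whole nodes looks like:

    salloc --partition=pdebug --nodes=2 --time=1:00:00 --exclusive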
Do NOT run computationally intensive work on the login nodes. There are a limited number of login nodes, and they are meant primarily for editing files and launching jobs. When a login node is sluggish, it is usually because a user has started a compile on it.
Pdebug is intended for debugging, visualization, and other inherently interactive work. It is not intended for production work. Do not use pdebug to run batch jobs, and do not chain jobs to run one after the other. Individuals who misuse the pdebug queue in this or any similar manner may be denied access to it.
Interactive access to a batch node is allowed while you have a batch job running on that node, and only for the purpose of monitoring your job. When logging into a batch node, be mindful of the impact your work has on the other jobs running on the node.
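For example (a sketch; the node name shown is hypothetical), you can find the node(s) your job is running on and then log in to check on it:

    squeue -u $USER                   # the NODELIST column shows where your jobs are running
    ssh rzhound1234                   # hypothetical node name taken from NODELIST
    top -u $USER                      # watch only your own processes, then log out when done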
Scratch Disk Space: Consult CZ File Systems Web Page: https://lc.llnl.gov/fsstatus/fsstatus.cgi
Documentation
- Linux Clusters Tutorial Part One | Linux Clusters Tutorial Part Two
- Slurm Tutorial (formerly Slurm and Moab)
- Compilers: see the Compilers page
Contact
Please call or send email to the LC Hotline if you have questions. LC Hotline | phone: 925-422-4531 | email: lc-hotline@llnl.gov