Full course description
Term: Spring 2025
Date: March 13th, 2025
Time: 2:00pm - 4:00pm
Location: University Mall 2104 & Online through Zoom
Instructor: Chris Kuhlman
Presented By: Advanced Research Computing (ARC)
Description:
A useful prerequisite for this workshop is the Slurm workshop entitled “Introduction to Slurm.” Slurm is used on many clusters to submit jobs (i.e., computational tasks) for execution on cluster resources. It is often useful to think of a single atomic operation or computation as one Slurm job. For many real-world problems, however, one does not run just one job, but rather many jobs, in order to perform sensitivity and parametric studies and to study stochastic effects. Here, “many jobs” can mean 10, 100, 1,000, 10,000, or more. There are multiple ways to run this many jobs. A first approach is to use one bash script to create many Slurm job scripts and a second bash script to submit those many individual jobs. In effect, this approach uses loops outside of Slurm to create and then submit the jobs. A second approach is to run many computations within one Slurm submission script, which effectively moves the loops inside a single Slurm job. We will look at the pros and cons of each approach. By the end of this workshop, you will understand the main issues around scalable job submission and how to accomplish it.
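The two approaches described above can be sketched roughly as follows. This is only an illustrative sketch and not the workshop's own material: it uses Python instead of bash to generate the script text, and the program name `./simulate`, the parameter values, the job names, and the time limits are all hypothetical placeholders. The `#SBATCH` options shown (`--job-name`, `--ntasks`, `--time`) are standard Slurm options, but the appropriate values depend on the cluster.

```python
from pathlib import Path


def write_individual_jobs(params, out_dir, program="./simulate"):
    """Approach 1: write one Slurm script per parameter value.

    `program` is a hypothetical executable; each script runs it once.
    Returns the list of script paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, p in enumerate(params):
        script = "\n".join([
            "#!/bin/bash",
            f"#SBATCH --job-name=run_{i}",   # assumed naming scheme
            "#SBATCH --ntasks=1",
            "#SBATCH --time=00:10:00",       # assumed time limit
            f"{program} {p}",
            "",
        ])
        path = out_dir / f"job_{i}.sh"
        path.write_text(script)
        paths.append(path)
    return paths


def submit_commands(paths):
    """Approach 1, second step: one sbatch command per generated script."""
    return [f"sbatch {p}" for p in paths]


def single_job_script(params, program="./simulate"):
    """Approach 2: one Slurm script that loops over all parameter values."""
    values = " ".join(str(p) for p in params)
    return "\n".join([
        "#!/bin/bash",
        "#SBATCH --job-name=sweep",          # assumed job name
        "#SBATCH --ntasks=1",
        "#SBATCH --time=01:00:00",           # assumed time limit
        f"for p in {values}; do",
        f'    {program} "$p"',               # runs the cases one after another
        "done",
        "",
    ])
```

With the first approach, the scheduler sees one job per parameter value, so each case is queued and scheduled independently. With the second, the scheduler sees a single job, and the loop inside it runs the cases sequentially within one allocation.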