Scalable Job Submissions

Mar 13, 2025 - Mar 13, 2025
2 credits

Full course description

Term: Spring 2025

Date: March 13th, 2025

Time: 2:00pm - 4:00pm

Location: University Mall 2104 & Online through Zoom

Instructor: Chris Kuhlman

Presented By: Advanced Research Computing (ARC)

Description:

A useful prerequisite for this workshop is the Slurm workshop entitled “Introduction to Slurm.” Slurm is used on many clusters to submit jobs (i.e., computational tasks) for execution on cluster resources. It is common, and useful, to think of a single atomic operation or computation as one Slurm job. Many real-world problems, however, require not one job but many, for example to perform sensitivity and parametric studies or to study stochastic effects. Here, “many jobs” can mean 10, 100, 1,000, 10,000, or more.

There are multiple ways to run these many jobs. A first approach is to use one bash script to create many Slurm job scripts and a second bash script to submit those many individual jobs. In effect, this approach uses loops to create and then submit the jobs. A second approach is to run many computations within one Slurm submission script, which effectively brings the loops inside a single Slurm job. We will look at the pros and cons of each. By the end of this workshop, you will understand the main issues around scalable job submission and how to accomplish it.
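As a rough illustration of the two approaches described above, here is a minimal sketch. It is not the workshop's own material: it uses Python rather than bash to generate the script text, and the program name `./simulate`, its `--param` flag, the file names, and the time limits are all hypothetical placeholders. The `#SBATCH` directives shown (`--job-name`, `--ntasks`, `--time`) are standard Slurm options, but the right values depend on your cluster.

```python
from pathlib import Path


def make_job_script(param, command="./simulate"):
    """Return the text of one Slurm batch script for a single parameter value.

    `command` and its `--param` flag are hypothetical placeholders.
    """
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name=sweep_{param}",
        "#SBATCH --ntasks=1",
        "#SBATCH --time=00:10:00",  # assumed limit; adjust for your workload
        f"{command} --param {param}",
        "",
    ])


def write_individual_jobs(params, out_dir):
    """Approach 1: write one script per parameter value.

    Each script would then be submitted as its own Slurm job.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, p in enumerate(params):
        path = out / f"job_{i:04d}.sh"
        path.write_text(make_job_script(p))
        paths.append(path)
    return paths


def make_single_job_script(params, command="./simulate"):
    """Approach 2: one Slurm job whose script loops over every parameter value."""
    values = " ".join(str(p) for p in params)
    return "\n".join([
        "#!/bin/bash",
        "#SBATCH --job-name=sweep_all",
        "#SBATCH --ntasks=1",
        "#SBATCH --time=01:00:00",  # assumed limit; must cover the whole loop
        f"for p in {values}; do",
        f"    {command} --param \"$p\"",
        "done",
        "",
    ])
```

With the first approach, each generated file would be submitted separately (for example, `sbatch job_0000.sh`), so the scheduler sees many small jobs. With the second, a single `sbatch` call submits one longer job that runs the computations one after another inside its loop.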

 
