Resource Utilization and Job Monitoring

Jul 17, 2025
2 credits

Full course description

Term: Summer 2025

Date: July 17th, 2025

Time: 10:00 a.m. to Noon

Location: Online Only

Instructor: Sarah Ghazanfari

Presented By: Advanced Research Computing (ARC)

Description:

This hands-on workshop teaches researchers and HPC users how to monitor, analyze, and optimize the performance of their computational jobs on ARC systems. Participants will gain a practical understanding of key system metrics, including CPU and memory usage, I/O demands, and GPU utilization, and of how these factors interact to affect overall job efficiency.

The course covers performance-monitoring tools, including command-line utilities such as seff, jobload, htop, gpumon, and sacct, and shows how to interpret their output. Participants will also explore advanced topics such as monitoring resource usage on login nodes, identifying and preventing wasteful jobs, and using Grafana dashboards to visualize real-time system performance.

Whether you're running individual simulations or managing large-scale workflows, this workshop will help you optimize your computational resource usage and contribute to more efficient use of shared HPC infrastructure.

Prerequisites:

Required: An ARC account

Preferred: Attendance at one or more previous ARC workshops, or basic familiarity with ARC systems, Slurm, and remote development.
