Mastering the Unix Shell: Advanced Commands and Scripting Techniques
Ended Jul 15, 2025
2 credits
Full course description
Term: Summer 2025
Date: July 15th, 2025
Time: 2:00 to 4:00 p.m.
Location: Online Only
Instructor: Chris Kuhlman
Presented By: Advanced Research Computing (ARC)
Description:
Bash is a Unix/Linux shell, and its commands form a programming language just like other languages (e.g., Python, Java). Almost all software-related projects, in all research fields, can benefit from bash commands, utilities, and scripting. The larger the project (i.e., the greater the number and size of computations), the more value bash scripting provides. Bash commands can be (1) entered directly on the command line to accomplish tasks, or (2) written to files that are run just like files containing Python (or other interpreted) source code. To put bash scripting into context, bash scripts are often much smaller than programs written in Python, C++, and other languages, and each script does more narrowly scoped work. But bash and other utility commands are designed so that small amounts of code (often just one line) can do the work of many lines of code in Python or another programming language. These commands can also be sequenced to create powerful pipelines. Because scripted steps are reproducible, bash commands, utilities, and scripting make your data preparation, code invocations, and data post-processing and management less error-prone. The workshop has two thrusts: (1) to convey concepts and ideas, and (2) to provide examples that work on ARC resources.
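As a rough illustration of how one line of sequenced commands can stand in for a longer program, the sketch below tallies how often each status appears in a small comma-separated file. The job names, the status values, and the temporary file are made up for this example and are not taken from the workshop materials.

```bash
#!/usr/bin/env bash
# Hypothetical sample data: one record per line in the form name,status.
tmpfile=$(mktemp)
printf '%s\n' 'job1,done' 'job2,failed' 'job3,done' 'job4,done' 'job5,failed' > "$tmpfile"

# One-line pipeline:
#   cut     extracts the second comma-separated field (the status)
#   sort    groups identical statuses together
#   uniq -c counts each group
#   sort -rn ranks the groups by count, largest first
#   awk     reformats each line as status=count
summary=$(cut -d, -f2 "$tmpfile" | sort | uniq -c | sort -rn | awk '{ print $2 "=" $1 }')

echo "$summary"

# Clean up the temporary file.
rm -f "$tmpfile"
```

Saving these lines to a file and running it with bash produces the same result as typing the pipeline at the prompt, which matches the two ways of using bash commands described above.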
Topics to be covered are: (1) variables and arrays, (2) functions, (3) math operations, (4) control structures [for, while, if, and case constructs], (5) the sed utility for textual substitutions, (6) awk for searching/interrogating file contents, (7) sorting lines of a file, (8) input and output files for bash programs, (9) useful commands such as grep, find, head, and tail, and (10) larger examples.
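To give a feel for several of these topics in one place, here is a minimal sketch that touches variables, arrays, a function, integer math, for/if/case constructs, sed, awk, sort, and head. It is not drawn from the workshop itself; every name and value in it (prefix, sizes, double, labels) is a hypothetical chosen only for illustration.

```bash
#!/usr/bin/env bash
# Illustrative only: all names and data below are hypothetical.

# Variables and arrays
prefix="run"
sizes=(10 200 3000)

# A function that uses integer math
double() {
    echo $(( $1 * 2 ))
}

# Control structures: for, if, and case
labels=()
for s in "${sizes[@]}"; do
    d=$(double "$s")
    if (( d > 1000 )); then
        size_class="large"
    else
        size_class="small"
    fi
    case "$size_class" in
        large) labels+=("${prefix}_${s}:L") ;;
        small) labels+=("${prefix}_${s}:S") ;;
    esac
done
echo "${labels[@]}"

# sed: substitute text in the first label
renamed=$(echo "${labels[0]}" | sed 's/run/test/')

# awk: sum the values; sort and head: pick the largest value
total=$(printf '%s\n' "${sizes[@]}" | awk '{ sum += $1 } END { print sum }')
largest=$(printf '%s\n' "${sizes[@]}" | sort -rn | head -n 1)
echo "$renamed $total $largest"
```

The script uses bash arrays, so it should be run with bash rather than a plain POSIX sh.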
You should consider attending if any of these apply: (1) you are a first-year graduate student (or younger), (2) you are in your first year of research computing, (3) you run more than a single, one-time analysis, (4) you do not currently use bash scripting or do not know what it is, (5) you work with data, (6) you have one or more operations to execute on data (e.g., pipelines), (7) you are new to Unix/Linux, (8) you want to increase your research/computing productivity. Within just two days of your own intensive computing work, the increased productivity will more than recoup the two hours you spend in this workshop.
Prerequisites:
The main prerequisites are conceptual; understanding these concepts is important: a directory, a file, a programming language, input data, output data.
A basic understanding of computer programming (in any language); even a beginner level is sufficient.
A basic understanding that you might want to manipulate data.
A basic understanding of UNIX/Linux shells (commands like cd, ls, pwd, and mkdir) is helpful, but not required.
An account on ARC and an account to run jobs (i.e., run codes) will eventually be required to make use of your knowledge, but they are not mandatory for the workshop.
The ability to connect to ARC clusters (either directly on campus or using VPN at home) is required to do the exercises, but you do not have to do the exercises during the workshop; the instructor will work through them.
No additional software (e.g., compilers) is needed; all required software is on the ARC clusters.