
Quickstart Guide for Carya/Sabine

How to log in

The only way to connect to our clusters is by secure shell (ssh), e.g. from a Linux/UNIX system:

ssh -l your_username carya.rcdc.uh.edu 
ssh -l your_username sabine.rcdc.uh.edu

You will use your CougarNet ID as your_username and your CougarNet password to log in. Windows users will need an SSH client installed on their machine; see, e.g., PuTTY, MobaXterm, or XShell.

VS Code is not supported and is not allowed on the cluster.

UH VPN is now mandatory for accessing the clusters from outside the campus network. Note that Windows users should avoid using WSL2 to connect over the VPN, as WSL2 is currently incompatible with UH VPN; instead, use PuTTY, MobaXterm, or XShell.

Best Practices on Login Nodes

The login nodes are not appropriate for computational work as they are shared among all users. Parallel applications, including MPI and multithreaded applications, are not permitted on the frontends or login nodes; instead, short parallel test runs should be carried out using SLURM batch jobs. Additionally, interactive batch jobs can be submitted, which, upon initiation, open a shell on one of the allocated compute nodes, allowing users to run interactive programs there.
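For example, a short interactive test run might be requested like this (the core count and time limit here are only illustrative):

srun -N 1 -n 4 -t 0:30:00 --pty /bin/bash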

Allocations

Users without project allocations cannot run jobs on Sabine or Opuntia. Users have been given a small allocation on Opuntia to continue running jobs there. For increased job allocations, please refer to the allocation request; a PI (supervisor) will have to submit a project proposal for Sabine/Opuntia.

Users can check the balance for their projects using the sbalance command:

sbalance balance statement project <projectname>

Users with multiple allocations, for instance those working for two different PIs (supervisors), need to specify the allocation in the batch script when submitting the job so that the correct PI's allocation is charged, e.g.:

#!/bin/bash
### Specify job parameters
#SBATCH -J test_job # name of the job
#SBATCH -t 1:00:00 # time requested
#SBATCH -N 1 -n 2 # total number of nodes and processes

## if you have multiple allocations
## you can tell SLURM which account to charge this job to
#SBATCH -A Allocation_AWARD_ID  # replace with your allocation award ID

or specify it when submitting an interactive job, e.g.:

srun -A Allocation_AWARD_ID --pty /bin/bash

Using tmux

Using tmux on the Carya/Sabine cluster allows you to create interactive allocations that you can detach from. Normally, if you get an interactive allocation (e.g. srun --pty) then disconnect from the cluster, for example by putting your laptop to sleep, your allocation will be terminated and your job killed. Using tmux, you can detach gracefully and tmux will maintain your allocation. Here is how to do this correctly:

  1. ssh to Carya or Sabine.
  2. Start tmux.
  3. Inside your tmux session, submit an interactive job with srun.
  4. Inside your job allocation (on a compute node), start your application (e.g. matlab).
  5. Detach from tmux by typing Ctrl+b then d.
  6. Later, on the same login node, reattach by running tmux attach.

Make sure to:

  • run tmux on the login node, NOT on compute nodes
  • run srun inside tmux, not the reverse.
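Putting the steps above together, a minimal sketch of the workflow (the session name, wall time, core count, and application are placeholders):

tmux new -s mysession                       # on the login node
srun -t 2:00:00 -N 1 -n 4 --pty /bin/bash   # inside tmux, request an interactive allocation
matlab                                      # on the compute node, start your application
# detach with Ctrl+b then d; later, on the same login node:
tmux attach -t mysession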

X11 Forwarding

X11 forwarding is necessary to display editor windows (gvim, emacs, nedit, etc.) or similar GUI applications on your desktop. To enable X11 forwarding, log in with the ssh -X or -Y option enabled:

ssh -XY -l your_username  carya.rcdc.uh.edu
ssh -XY -l your_username sabine.rcdc.uh.edu

Windows users need an X server to handle the local display in addition to the ssh program; see this intro (from Indiana University) for PuTTY users.

Transferring Data

Basic Tools

SCP (Secure CoPy): scp uses ssh for data transfer, and uses the same authentication and provides the same security as ssh. For example, copying from a local system to Carya:

scp myfile your_username@carya.rcdc.uh.edu:

scp myfile your_username@sabine.rcdc.uh.edu:

To recursively copy a directory:
scp -r my_directory your_username@carya.rcdc.uh.edu:
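To copy in the other direction, from the cluster back to your local machine, reverse the source and destination (the remote file name here is illustrative):

scp your_username@carya.rcdc.uh.edu:myfile .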

SFTP (Secure File Transfer Protocol): sftp is a file transfer program, similar to ftp, which performs all operations over an encrypted ssh transport. For example, to put a file from a local system onto Sabine (this also works for Carya):

sftp your_username@sabine.rcdc.uh.edu 
Password: 
Connected to sabine.rcdc.uh.edu 

sftp> put myfile

For Windows users, WinSCP is a free graphical SCP and SFTP client.

RSYNC: rsync is a utility for efficiently transferring and synchronizing files, either locally (e.g. to an external hard drive) or across networked computers, by comparing the modification times and sizes of files. Its primary advantage over scp is fast synchronization, since it copies only new or updated files. To transfer a file or directory to Carya or Sabine:

rsync -avP file username@sabine.rcdc.uh.edu:path_to_destination_directory 
rsync -avP directory username@sabine.rcdc.uh.edu:path_to_destination_directory 

Data Transfer With GLOBUS

The Carya and Sabine clusters are both Globus endpoints. Users can transfer files to/from Carya or Sabine using the Globus web application. Users can also use the Globus Connect Personal application to initiate data transfers between their desktop and the clusters. More details on using Globus with the DSI clusters are available here, along with a YouTube tutorial for Globus Connect Personal here.

Software Environment

Text editors

Carya/Sabine have command line editors installed including emacs, nano and vim.

Modules

Modules are a tool for users to manage their Unix environment on Carya/Sabine, designed to simplify login scripts. A single user command,

module add module_name

can be invoked to source the appropriate environment information within the user’s current shell. Invoking the command,

module available

or use the abbreviated form

ml avail

or even shorter

ml av

will list the available packages on Carya or Sabine.

module rm module_name

will remove the module from your environment.
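For example, a typical module session might look like the following (the module name is illustrative; module list shows what is currently loaded):

ml av                      # list available packages
module add intel-oneapi    # load a package into your environment
module list                # show currently loaded modules
module rm intel-oneapi     # remove it again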

Running Jobs

The Concept

A "job" refers to a program running on the compute nodes of the Carya, Opuntia, or Sabine clusters. Jobs can be run on clusters  in two different ways:

  • A batch job allows you to submit a script that tells the cluster how to run your program. Your program can run for long periods of time in the background, so you don't need to be connected to the cluster. The output of your program is continuously written to an output file that you can view both during and after your program runs.
  • An interactive job allows you to interact with a program by typing input, using a GUI, etc. But if your connection is interrupted, the job will abort. These are best for small, short-running jobs where you need to test out a program, or where you need to use the program's GUI.

The Code

The following shows how to run an example of a parallel program (using MPI) on Carya, Opuntia, or Sabine. MPI programs are executed as one or more processes; one process is typically assigned to one physical processor core. All the processes run the exact same program, but by receiving different inputs they can be made to do different tasks. The most common way to differentiate the processes is by their rank. Together with the total number of processes, referred to as size, they form the basic method of dividing the tasks between the processes. Getting the rank of a process and the total number of processes is therefore the goal of this example. Furthermore, all MPI-related instructions must be issued between MPI_Init() and MPI_Finalize(). Regular C instructions that are to be run locally for each process, e.g. some preprocessing that is equal for all processes, can be run outside the MPI context.

Below is a simple program that, when executed, will make each process print its name and rank as well as the total number of processes.

/*  Basic MPI Example - Hello World  */
#include <stdio.h>  /* printf and BUFSIZ defined there */
#include <stdlib.h> /* exit defined there */
#include "mpi.h"    /* all MPI-2 functions defined there */

int main(int argc, char *argv[])
{
    int rank, size, length;
    char name[BUFSIZ];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &length);

    printf("%s: hello world from process %d of %d\n", name, rank, size);

    MPI_Finalize();

    exit(0);
}
    • MPI_Init(); Is responsible for spawning processes and setting up the communication between them. The default communicator (collection of processes) MPI_COMM_WORLD is created.
    • MPI_Finalize(); Ends the MPI program.
    • MPI_Comm_rank( MPI_COMM_WORLD, &rank ); Returns the rank of the process within the communicator. The rank is used to divide tasks among the processes. The process with rank 0 might get some special task, while the rank of each process might correspond to distinct columns in a matrix, effectively partitioning the matrix between the processes.
    • MPI_Comm_size( MPI_COMM_WORLD, &size ); Returns the total number of processes within the communicator. This can be useful to e.g. know how many columns of a matrix each process will be assigned.
    • MPI_Get_processor_name( name, &length ); Is more of a curiosity than necessary in most programs; it can assure us that our MPI program is indeed running on more than one computer/node.

 

Compile & Run

Save the code in a file named helloworld.c. Load the Intel compiler and Intel MPI module files:

ml intel-oneapi 	

Compile the program with the following command:

mpiicx -o helloworld helloworld.c	

Make a batch job. Add the following in a file named job.sh

#!/bin/bash 
#SBATCH -J my_mpi_job 
#SBATCH -o my_mpi_job.o%j 
#SBATCH -t 00:01:00 
#SBATCH -N 2 -n 10 

ml intel-oneapi
mpirun ./helloworld

 

Submit the job to the queue.

sbatch job.sh
Submitted batch job 906

Note that the sbatch command returns the job ID. Also note that this example runs quickly; it may finish before a status command such as squeue shows it in the queue. The job identifier, together with the job name, is used to name the output file from the job. The job name is given with the -J option in the job.sh script; in this example it is 'my_mpi_job'. The standard output from the processes is written to a file in the working directory named my_mpi_job.o<job_id>. Here is the content from one batch execution of job.sh:

 cat my_mpi_job.o906 

compute-2-13.local: hello world from process 9 of 10 
compute-2-12.local: hello world from process 1 of 10 
compute-2-12.local: hello world from process 3 of 10 
compute-2-12.local: hello world from process 5 of 10 
compute-2-12.local: hello world from process 6 of 10 
compute-2-12.local: hello world from process 7 of 10 
compute-2-12.local: hello world from process 8 of 10 
compute-2-12.local: hello world from process 0 of 10 
compute-2-12.local: hello world from process 2 of 10 
compute-2-12.local: hello world from process 4 of 10
		

Note that standard error from all the processes goes to the same -o file unless you also request a separate error file with, e.g., #SBATCH -e my_mpi_job.e%j, in which case it is written to my_mpi_job.e<job_id>. If the processes execute without faults, no errors are logged (the error file is empty).

SLURM Script Generator

An online job script generator application is provided at https://secure.hpedsi.uh.edu/slurm. The web application is designed to assist users in creating template SLURM job scripts for different clusters at HPE-DSI. It's a great starting point for creating various batch job workflows. Users are encouraged to customize the generated scripts to further suit their needs.

Batch Jobs

Note that on Carya there is no special partition for GPUs, so "-p gpu" is not needed when submitting jobs.

Users can check the status of a job with the squeue commands below.

squeue -j <JOB_ID>
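To list all of your own jobs rather than a single job ID, squeue also accepts a user filter, e.g.:

squeue -u your_username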

Single whole node

#!/bin/bash 
#SBATCH -J my_mpi_job 
#SBATCH -o my_mpi_job.o%j 
#SBATCH -t 00:01:00 
#SBATCH -N 1 -n 28  

ml intel-oneapi  
mpirun ./helloworld
Multiple whole nodes

This example uses 4 nodes and 28 tasks or cores per node

#!/bin/bash
#SBATCH -J my_mpi_job
#SBATCH -o my_mpi_job.o%j
#SBATCH -t 00:01:00 
#SBATCH -N 4 --ntasks-per-node=28

module load intel-oneapi
mpirun ./helloworld
Single  core  job utilizing  1 GPU (if you need only a single CPU core and one GPU)
#!/bin/bash 
#SBATCH -J my_job 
#SBATCH -o my_job.o%j
#SBATCH -t 00:01:00 
#SBATCH -n 1
#SBATCH --gpus=1
 
ml CUDA
nvidia-smi  
./helloworld
Single  node job utilizing  1 GPU (if you need only one GPU but with multiple CPUs from the same node)
#!/bin/bash 
#SBATCH -J my_mpi_job
#SBATCH -o my_mpi_job.o%j
#SBATCH -t 00:01:00
#SBATCH -N 1 -n 16
#SBATCH --gpus=1

ml CUDA

nvidia-smi  
mpirun ./helloworld
Single  node utilizing  2 GPUs (if you need two GPUs, along with multiple CPUs all from one node)
#!/bin/bash 
#SBATCH -J my_mpi_job 
#SBATCH -o my_mpi_job.o%j 
#SBATCH -t 00:01:00 
#SBATCH -N 1 -n 28
#SBATCH --gpus=2
ml CUDA intel-oneapi
nvidia-smi
mpirun ./helloworld
Multiple Whole nodes job with 2 GPUs per node (only on Sabine). This example uses 4 nodes and 28 tasks or cores per node
#!/bin/bash
#SBATCH -J my_mpi_job 
#SBATCH -o my_mpi_job.o%j
#SBATCH -t 00:01:00
#SBATCH -N 4 --ntasks-per-node=28

#SBATCH --gpus-per-node=2 

ml CUDA intel-oneapi

nvidia-smi  
mpirun ./helloworld

Batch Array Jobs

For running tens to thousands of jobs, you can use the job array mechanism, assuming the inputs and outputs carry a contiguous serial number to distinguish them. The SLURM_ARRAY_TASK_ID variable will hold the serial number that appears in the file names.

Sample array job for 100 jobs

#!/bin/bash
#SBATCH -N 1 #number of nodes
#SBATCH --ntasks-per-node=1  #number of tasks per node
#SBATCH  -J myn_job       # job name
#SBATCH  -o myjob.o%j       # output and error file name (%j expands to jobID)
#SBATCH --time=4:00:00     # run time (hh:mm:ss) - 4 hours
#SBATCH --mem-per-cpu=2GB # assuming estimate for memory needed was 1-1.5 GB  
#SBATCH --mail-user=johnDoe@uh.edu
#SBATCH --mail-type=end    # email me when the job finishes, also includes efficiency report.
#SBATCH --array=1-100

./run_my_app input_$SLURM_ARRAY_TASK_ID.inp > output_$SLURM_ARRAY_TASK_ID.out    

Interactive Jobs

To open an interactive session on a compute node, use the following:
 salloc
Same as above, but requesting 1 hour of wall time  and X11 forwarding support
 salloc -t 1:00:00 --x11=first 
Same as above, but requesting 48 cores or a full node on Carya
 salloc -t 1:00:00 -n 48 -N 1 
Requesting 28 cores or a full node on Sabine
 salloc -t 1:00:00 -n 28 -N 1 
Requesting 52 cores or a full node on Carya
 salloc -t 1:00:00 -n 52 -N 1
Requesting GPUs

Requesting 24 cores and 1 GPU on Carya
 salloc -t 1:00:00 -n 24 --gpus=1 -N 1
Requesting 48 cores or a full node and 2 GPUs on Carya
 salloc  -t 1:00:00 -n 48 --gpus=2 -N 1
Requesting 24 cores or a full node and 1 volta architecture GPU on Carya
 salloc  -t 1:00:00 -n 24 -N 1  --gpus=volta:1

Requesting one L40S (Ada architecture) GPU and 32 cores on Carya

 salloc -t 1:00:00 -n 32 -N 1 --gpus=ada:1 
Requesting 28 cores or a full node and 1 GPU on Sabine
 salloc -t 1:00:00 -n 28 --gpus=1 -N 1 
Requesting 28 cores or a full node and 2 GPUs on Sabine
 salloc -t 1:00:00 -n 28 --gpus=2 -N 1

Same as above, but requesting 4 nodes and 28 cores per  node (on Sabine)

 salloc -t 1:00:00 --ntasks-per-node=28 -N 4
Requesting 28 cores per node, 2 GPUs per node, and 4 nodes (on Sabine)
 salloc -t 1:00:00 --ntasks-per-node=28 --gpus=2 -N 4

Python Jobs

Batch Job Examples

Example of python batch job utilizing  1 CPU core


#!/bin/bash 
#SBATCH -J  python_job
#SBATCH -o  python_job.o%j
#SBATCH -t 00:01:00
#SBATCH -N 1 -n 1
#SBATCH --mem=4GB

ml Miniforge
python your_python_script.py

Conda Virtual Environments

Sometimes your Python workflow requires creating a dedicated virtual environment for running your jobs. Keep in mind that the home directory has a size limit of 10 GB, so it is not advisable to install virtual environments directly in your home directory; instead, install them in your group's project directory using the path option of the conda create command, i.e. conda create -p /path/to/install/virtualenv ...

Below are steps to create and run your virtual environment.

Creating a virtual environment for Python 3.10

module add Miniforge3/py3.10

export CONDA_PKGS_DIRS=/project/your_PIs_project_name/your_user_name/conda_cache_dir

conda create -p /project/your_PIs_project_name/your_user_name/your_virtual_env_name

Sometimes your Python workflow might require a different major version of Python than the default, for instance Python 3.9 or Python 3.11. Below are the steps to create an environment for a Python version other than the default 3.10.

Creating a virtual environment for Python 3.11

module add Miniforge3/py3.10

export CONDA_PKGS_DIRS=/project/your_PIs_project_name/your_user_name/conda_cache_dir

conda create -p /project/your_PIs_project_name/your_user_name/myenv_3.11 python==3.11

Using the virtual environment

module add Miniforge3/py3.10 

source activate /project/your_PIs_project_name/your_user_name/your_virtual_env_name

export CONDA_PKGS_DIRS=/project/your_PIs_project_name/your_user_name/conda_cache_dir

#install python package(s) e.g. scipy and matplotlib
conda install scipy matplotlib

#test the installed package(s)
python -c "import scipy, matplotlib"

Managing Conda Cache

By default, Conda stores cached files in the user's home directory, which can quickly become full and lead to problems. To modify this behavior, you can change the cache directory by either setting the pkgs_dirs entry in the .condarc file or defining the CONDA_PKGS_DIRS environment variable. To find out the current cache directory, first load the Miniforge module:

module add Miniforge3/py3.10
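Then one way to inspect the configured package cache location is with conda info, which lists the package cache directories in its output:

conda info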

Below is an example of the steps for a hypothetical user with the following profile:

supervisor/PIs name = Dr. Dow Jones

project name = jones

cougarnet user name/id: jderick23

desired  custom conda virtual environment =  scikit-learn from conda-forge channel

module add Miniforge3/py3.10 

source $(dirname `which python`)/../etc/profile.d/conda.sh

export CONDA_PKGS_DIRS=/project/jones/jderick23/conda_cache_dir

conda create -p /project/jones/jderick23/my-scikit-learn -c conda-forge scikit-learn

To use the my-scikit-learn virtual environment you just created:

module add Miniforge3/py3.10 

source activate /project/jones/jderick23/my-scikit-learn 

The example below shows how to use the "my-scikit-learn" environment inside a batch job


#!/bin/bash 
#SBATCH -J  python_job
#SBATCH -o  python_job.o%j
#SBATCH -t 00:01:00
#SBATCH -N 1 -n 1
#SBATCH --mem=4GB

module load Miniforge3/py3.10

source activate /project/jones/jderick23/my-scikit-learn 
python your_python_script.py

TensorFlow Jobs

TensorFlow is available as a standalone module and also within the conda Python installations. The installed versions can also take advantage of GPUs if executed on a node with GPU(s).
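As a quick sanity check that TensorFlow can see the GPU(s) allocated to your job, you can run a short one-liner after loading the module (this assumes a TensorFlow 2.x installation):

python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"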
Batch Job Example
Single core job utilizing 1 GPU (if you need only a single CPU core and one GPU)

#!/bin/bash 
#SBATCH -J tensorflow_job
#SBATCH -o tensorflow_job.o%j
#SBATCH -t 00:01:00
#SBATCH -N 1 -n 1
#SBATCH --gpus=1
#SBATCH --mem=32GB

module load TensorFlow
python convolutional_network.py
 

Single node dual core  job utilizing  2 GPUs and 2 CPUs (works only on Sabine) 

#!/bin/bash 
#SBATCH -J tensorflow_job
#SBATCH -o tensorflow_job.o%j
#SBATCH -t 00:01:00
#SBATCH -N 1 -n 2
#SBATCH --gpus=2
#SBATCH --mem=64GB

module load TensorFlow
python convolutional_network.py

Pytorch and Torchvision Jobs

PyTorch and Torchvision are available within the Python installations and as standalone modules. The installed versions take advantage of CPUs and GPUs. Note that PyTorch is a dependency of torchvision; if you need the cluster-compiled version of PyTorch and a compatible torchvision, simply load torchvision, e.g.

module add torchvision/0.15.2-foss-2022a-CUDA-11.7.0 
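Similarly, a quick way to confirm that the loaded PyTorch build can see a GPU inside your job is:

python -c "import torch; print(torch.cuda.is_available())"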

Batch Job Examples
Single  core  job utilizing 1 GPU (If you need only a single CPU core and one GPU)

 #!/bin/bash
 #SBATCH -J torch_job
 #SBATCH -o torch_job.o%j
 #SBATCH -t 00:01:00 
 #SBATCH -N 1 -n 1 
 #SBATCH --gpus=1
 #SBATCH --mem=32GB
     
ml torchvision 
python pytorch_script.py

Single node dual core  job utilizing  2 GPUs and 2 CPUs (works only on Sabine)

#!/bin/bash 
#SBATCH -J torch_job
#SBATCH -o torch_job.o%j
#SBATCH -t 00:01:00
#SBATCH -N 1 -n 2 
#SBATCH --gpus=2 
#SBATCH --mem=64GB
module load torchvision
python pytorch_script.py

GROMACS Jobs

GROMACS is available as a module on the Sabine and Opuntia clusters. The installed versions can also take advantage of GPUs.

Batch GROMACS Jobs

Below are more examples of batch jobs requesting certain resources (the module names match the ones installed on Sabine; please adjust for Opuntia). Note that -maxh is set to 4 hours to match the requested wall time, so GROMACS can end gracefully, i.e. write any checkpoint or other needed files, before the SLURM job's time expires.

Single Whole node

#!/bin/bash 
#SBATCH -J my_sim_job 
#SBATCH -o my_sim_job.o%j 
#SBATCH -t 04:00:00 
#SBATCH -N 1 --ntasks-per-node=28

ml GROMACS
mpirun gmx_mpi mdrun -v -deffnm dhfr -maxh 4.0 
Single Whole GPU node
#!/bin/bash
#SBATCH -J my_sim_job 
#SBATCH -o my_sim_job.o%j 
#SBATCH -t 04:00:00
#SBATCH -N 1 --ntasks-per-node=4
#SBATCH --cpus-per-task=7
#SBATCH --gpus-per-node=2

ml GROMACS
mpirun gmx_mpi mdrun -v -deffnm dhfr -maxh 4.0
Multiple Whole nodes
#!/bin/bash 
#SBATCH -J my_sim_job 
#SBATCH -o my_sim_job.o%j 
#SBATCH -t 04:00:00 
#SBATCH -N 2 --ntasks-per-node=4

ml GROMACS
mpirun gmx_mpi mdrun -v -deffnm dhfr -maxh 4.0

Multiple Whole GPU nodes


#!/bin/bash 
#SBATCH -J my_sim_job 
#SBATCH -o my_sim_job.o%j 
#SBATCH -t 04:00:00
#SBATCH -N 2 --ntasks-per-node=4
#SBATCH --cpus-per-task=7 
#SBATCH --gpus-per-node=2

ml GROMACS
mpirun gmx_mpi mdrun -v -deffnm dhfr -maxh 4.0

NAMD  Jobs

NAMD is available as a module on the Sabine and Opuntia clusters. The installed versions can also take advantage of distributed memory processors using MPI.

Batch NAMD Jobs

Below are more examples for batch jobs requesting certain resources (the module names match the ones installed on Sabine - please adjust for Carya).  

Single Whole node


#!/bin/bash 
#SBATCH -J my_sim_job 
#SBATCH -o my_sim_job.o%j 
#SBATCH -t 04:00:00 
#SBATCH -N 1 --ntasks-per-node=28

module add NAMD 
mpirun namd2 namd.conf 

Multiple Whole nodes


#!/bin/bash 
#SBATCH -J my_sim_job 
#SBATCH -o my_sim_job.o%j 
#SBATCH -t 04:00:00 
#SBATCH -N 2 
#SBATCH --ntasks-per-node=28  # Asking for 2 Nodes and 56 cores on Sabine 

module add NAMD
mpirun namd2 namd.conf

MATLAB  Jobs

MATLAB is available as a module on the Carya, Opuntia, and Sabine clusters. The installed versions can also take advantage of distributed memory processors. Keep in mind that Matlab jobs might require users to request more memory depending on the size of the input arrays and temporary arrays etc., used in the job.

Batch MATLAB Jobs

Below are more examples of MATLAB batch jobs requesting certain resources.

Single core/processor


#!/bin/bash 
#SBATCH -J job_name
#SBATCH -o job_name.o%j 
#SBATCH -t 04:00:00 
#SBATCH -N 1 --ntasks-per-node=1
#SBATCH --mem-per-cpu=8gb  # you might require more or less than this amount

# assuming your MATLAB code is stored in the file mycompute.m, you can run it as shown below
module add matlab
matlab -r mycompute

 Multiple processors on a single node


#!/bin/bash 
#SBATCH -J job_name 
#SBATCH -o job_name.o%j 
#SBATCH -t 04:00:00 
#SBATCH -N 1 # Asking for 1 Node
#SBATCH --ntasks-per-node=20 # Asking for 20 cores on Opuntia 
#SBATCH --mem-per-cpu=2gb  # you might require more or less memory than this

# assuming your MATLAB code is stored in the file mycompute.m, you can run it as shown below
module add matlab
matlab -r mycompute

R and Rstudio Jobs

The R program is available as a module on the Carya, Opuntia, and Sabine clusters. The installed versions can also take advantage of distributed memory processors. Keep in mind that R jobs might require users to request more memory depending on the size of the input arrays and temporary arrays etc., used in the job.

 

Sample R or Rscript

 
#!/bin/bash
#SBATCH -N 1 #number of nodes
#SBATCH --ntasks-per-node=1  #number of tasks per node
#SBATCH -J myn_job   # job name
#SBATCH -o myjob.o%j   # output and error file name (%j expands to jobID)
#SBATCH --time=0:20:00      # run time (hh:mm:ss) 
#SBATCH --mem-per-cpu=4GB # assuming estimate for memory needed per cpu was 3.8 GBs
#SBATCH --mail-user=johnDoe@uh.edu
#SBATCH --mail-type=end    #email me when the job finishes, also includes efficiency report.
 
ml R

Rscript your_R_script.R 

AlphaFold Jobs

The AlphaFold program is available as a module on the Carya and Sabine clusters. The installed versions use a Python script called "run_alphafold.py" as the main driver program and come with support for running on NVIDIA GPUs. Keep in mind that AlphaFold jobs might require users to request more memory than the default memory settings. A sample SLURM job script for AlphaFold is shown below; users are expected to provide their own sequence file in FASTA format.

Sample AlphaFold job

 
#!/bin/bash 
#SBATCH -J myn_job
#SBATCH -o myjob.o%j 
#SBATCH --time=4:00:00
#SBATCH --mem-per-cpu=4GB
#SBATCH --mail-user=johnDoe@uh.edu
#SBATCH --mail-type=end   
#SBATCH --ntasks-per-node=14
#SBATCH -N 1 --gpus=1

ml AlphaFold
run_alphafold.py --max_template_date=2021-11-01 --fasta_paths myseq.fasta --output_dir result_dir

Quantum Espresso Jobs

The Quantum Espresso Suite of programs is available as a module on the Carya, and Sabine clusters. The installed versions come with support to run in parallel across multiple CPU cores.

Sample Quantum Espresso job

 
#!/bin/bash 
#SBATCH -J myn_job
#SBATCH -o myjob.o%j 
#SBATCH --time=4:00:00
#SBATCH --mem-per-cpu=2GB
#SBATCH --mail-user=johnDoe@uh.edu
#SBATCH --mail-type=end 
#SBATCH -n 32 

ml QuantumESPRESSO
# using pw.x application
mpirun pw.x < input.in  > output.log

LAMMPS Jobs

LAMMPS is a classical molecular dynamics code with a focus on materials modeling. The LAMMPS program is available as a module on the Carya, and Sabine clusters. The installed versions come with support to run in parallel across multiple CPU cores.

Sample LAMMPS job

 
#!/bin/bash 
#SBATCH -J myn_job
#SBATCH -o myjob.o%j 
#SBATCH --time=4:00:00
#SBATCH --mem-per-cpu=2GB
#SBATCH --mail-user=johnDoe@uh.edu
#SBATCH --mail-type=end 
#SBATCH -n 32 

ml LAMMPS
# using lmp application
mpirun lmp < in.protein >log.protein