Allocations/Accounting
The basic unit of our computational resources is the service unit (SU), which represents the use of one CPU core for one hour. Because the CPUs and GPUs across our computational resources differ in performance and price, the number of SUs charged for a job depends on the hardware it uses. These differences are expressed by a weight factor for each resource. The following tables show the weight factors for the various hardware components.
Carya Hardware Resource | Weight Factor |
---|---|
Intel Xeon G6252 CPU | 1.0 |
Nvidia V100 (Volta) GPU | 10 |
Sabine Hardware Resource | Weight Factor |
---|---|
Intel Xeon G6148 (Gen 10) CPU | 1.0 |
Intel Xeon E5-2680v4 (Gen 9) CPU | 0.85 |
Nvidia P100 (Pascal) GPU | 10 |
Nvidia V100 (Volta) GPU | 10 |
Opuntia Hardware Resource | Weight Factor |
---|---|
Intel Xeon E5-2680v2 CPU | 1.0 |
Nvidia K40 (Tesla) GPU | 4.5 |
Calculation for Usage of SUs
Generally speaking, a job is charged the following number of SUs, with the duration measured in hours:
No. of SUs charged = duration of a job × (no. of CPUs × CPU weight factor + no. of GPUs × GPU weight factor)
Example 1: a parallel MPI application is using the 128 cores on one of the Gen10 nodes on Sabine for 16 hours.
No. of SUs charged = 16 × ( 128 × 1.0) = 2,048 SUs
Example 2: an application uses 1 P100 GPU and 4 cores of a Gen9 node on Sabine for 2 hours. (Note: every GPU job must request at least 1 CPU core per node.)
No. of SUs charged = 2 × ( 4 × 0.85 + 1 × 10 ) = 26.8 SUs
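The charging formula can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the accounting system's actual implementation: the function name `sus_charged` and its parameters are hypothetical, and the weight factors are copied from the tables above.

```python
def sus_charged(hours, n_cpus, cpu_weight, n_gpus=0, gpu_weight=0.0):
    """Return the SUs charged for a single job (hypothetical helper).

    hours      -- duration of the job in hours
    n_cpus     -- number of CPU cores used
    cpu_weight -- weight factor of the CPU type
    n_gpus     -- number of GPUs used (0 for CPU-only jobs)
    gpu_weight -- weight factor of the GPU type
    """
    return hours * (n_cpus * cpu_weight + n_gpus * gpu_weight)


# Example 1: 128 Gen10 cores (weight 1.0) on Sabine for 16 hours
print(sus_charged(16, 128, 1.0))                  # 2048.0

# Example 2: 4 Gen9 cores (weight 0.85) and 1 P100 GPU (weight 10) for 2 hours
print(round(sus_charged(2, 4, 0.85, 1, 10), 1))   # 26.8
```

The second result is rounded only for display, since the weight 0.85 is not exactly representable in floating point.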
Calculation to Estimate SUs
For an allocation request, a user has to determine the resources required for a job, estimate the approximate duration of one job, and estimate the number of jobs to run in a year.
No. of SUs estimated = no. of jobs × duration of a job × (no. of CPUs × CPU weight factor + no. of GPUs × GPU weight factor)
Example 1: a user estimates that they need to run 250 jobs, each using 128 CPU cores for 16 hours (no GPUs are used in this example). The number of SUs to request would be
No. of SUs estimated = 250 × 16 × (128 × 1.0 + 0) = 512,000 SUs
Example 2: a user estimates that they need to run 2,000 jobs, each using 2 CPU cores and 2 Nvidia V100 GPUs for 8 hours.
No. of SUs estimated = 2,000 × 8 × (2 × 1.0 + 2 × 10) = 352,000 SUs
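For preparing an allocation request, the same estimate can be sketched in Python. The function name `estimate_sus` and its parameters are hypothetical and chosen only for illustration; the weight factors must be taken from the tables above for the cluster and hardware actually used.

```python
def estimate_sus(n_jobs, hours_per_job, n_cpus, cpu_weight,
                 n_gpus=0, gpu_weight=0.0):
    """Estimate the SUs to request for a year of jobs (hypothetical helper).

    Assumes all jobs use the same resources and run for the same duration.
    """
    per_job_rate = n_cpus * cpu_weight + n_gpus * gpu_weight  # SUs per hour
    return n_jobs * hours_per_job * per_job_rate


# Example 1: 250 jobs, each on 128 CPU cores (weight 1.0) for 16 hours
print(estimate_sus(250, 16, 128, 1.0))          # 512000.0

# Example 2: 2,000 jobs, each on 2 CPU cores (weight 1.0)
# and 2 V100 GPUs (weight 10) for 8 hours
print(estimate_sus(2000, 8, 2, 1.0, 2, 10))     # 352000.0
```

If a workload mixes several job types, one simple approach is to call the function once per job type and add the results.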