Policies


Note

Shabyt was recently set up and is now open for general access. Software configurations are continually being updated. The policies described here are subject to change based on decisions of the NU HPC Committee and on actual system utilization.

Acceptable Use

The HPC system is a unique resource for NU researchers and the community. It has special characteristics, such as a large amount of RAM and the capability for massive parallelism. Because it is both unique and expensive, its use is supervised by the HPC team to ensure fair and efficient utilization.

Storage quotas

The current default storage quota for users' home directories is 100 GB. Users who need more storage for their work can request an increased allocation from the HPC admins. For particularly large, multi-terabyte storage needs, Shabyt has an HDD array with a total capacity of 120 TB.
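To see how much of the default quota you are using, standard Linux utilities on the login node are usually sufficient. The commands below are a minimal sketch; the exact quota tooling depends on the filesystem configuration.

 # Total size of your home directory (compare against the 100 GB default quota)
 du -sh "$HOME"
 # Free space on the filesystem that hosts your home directory
 df -h "$HOME"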

Data backup

Please be advised that users take full responsibility for the integrity and safety of their data stored on Shabyt. While Shabyt features enterprise-level hardware, failures are still possible, especially because the underlying storage systems are designed for high throughput and have no redundancy. Shabyt currently has no automatic backup of users' data. While this may change in the future as the HPC team continues to configure the system, at this time users are strongly advised to back up any important data stored in their home directories on a regular basis.
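Until automatic backups are available, one simple way to keep a copy of important files off the cluster is rsync over SSH. The sketch below assumes you run it from your own workstation; the username, hostname, and paths are placeholders and should be replaced with your own.

 # Pull a results directory from the cluster to a local backup folder
 # "myuser" and "shabyt.nu.edu.kz" are placeholders for your account and the login node address
 rsync -avz --progress myuser@shabyt.nu.edu.kz:/home/myuser/results/ ~/shabyt-backup/results/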

Queues and the number of jobs

Shabyt currently has two partitions for user jobs. While the system is still being configured and fine-tuned, there is no hard limit on the number of jobs an individual user may submit, but this will likely change in the near future.
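The available partitions and the current state of the queue can be inspected with the standard SLURM commands:

 # List partitions, their node counts, and their state
 sinfo
 # Show your own jobs that are currently queued or running
 squeue -u $USER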

Acknowledgment

If the computational resources provided by the NU HPC facilities played an important role in work leading to a publication, we would greatly appreciate an acknowledgment.

Users are responsible for complying with the general policies.

Job Submission

Jobs are submitted using the SLURM batch system. Below are examples of batch scripts for different types of jobs:

Serial Job

 #!/bin/bash
 #SBATCH --job-name=Test_Serial
 #SBATCH --nodes=1
 #SBATCH --ntasks=1
 #SBATCH --time=3-0:00:00
 #SBATCH --mem=5G
 #SBATCH --partition=CPU
 #SBATCH --output=stdout%j.out
 #SBATCH --error=stderr%j.out
 #SBATCH --mail-type=END,FAIL
 #SBATCH --mail-user=my.email@nu.edu.kz
 #SBATCH --get-user-env
 #SBATCH --no-requeue

 pwd; hostname; date
 cp myfile1.dat myfile2.dat
 ./my_program myfile2.dat
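To run one of these examples, save the script to a file and pass it to sbatch. The filename and job ID below are only illustrations.

 # Submit the batch script (the filename is just an example)
 sbatch serial_job.sh
 # Cancel a queued or running job by its ID if needed (12345 is a placeholder)
 scancel 12345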

SMP Job

 #!/bin/bash
 #SBATCH --job-name=Test_SMP
 #SBATCH --nodes=1
 #SBATCH --ntasks=1
 #SBATCH --cpus-per-task=8
 #SBATCH --time=3-0:00:00
 #SBATCH --mem=20G
 #SBATCH --partition=CPU
 #SBATCH --output=stdout%j.out
 #SBATCH --error=stderr%j.out
 #SBATCH --mail-type=END,FAIL
 #SBATCH --mail-user=my.email@nu.edu.kz
 #SBATCH --get-user-env
 #SBATCH --no-requeue

 pwd; hostname; date
 export OMP_NUM_THREADS=8
 ./my_smp_program myinput.inp > myoutput.out
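As an optional variant (not part of the original example), the thread count can be taken from SLURM's environment instead of being hard-coded, so it always matches the --cpus-per-task directive:

 # Use the CPU allocation reported by SLURM; fall back to 1 if the variable is unset
 export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
 ./my_smp_program myinput.inp > myoutput.out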

Distributed Memory Parallelism (MPI) Job

 #!/bin/bash
 #SBATCH --job-name=Test_MPI
 #SBATCH --nodes=2
 #SBATCH --ntasks=256
 #SBATCH --ntasks-per-node=128
 #SBATCH --time=3-0:00:00
 #SBATCH --mem=250G
 #SBATCH --partition=CPU
 #SBATCH --exclusive
 #SBATCH --output=stdout%j.out
 #SBATCH --error=stderr%j.out
 #SBATCH --mail-type=END,FAIL
 #SBATCH --mail-user=my.email@nu.edu.kz
 #SBATCH --get-user-env
 #SBATCH --no-requeue

 pwd; hostname; date
 NP=${SLURM_NTASKS}
 module load gcc/9.3.0
 module load openmpi/gcc9/4.1.0
 mpirun -np ${NP} ./my_mpi_program myinput.inp > myoutput.out
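After a job finishes, its resource usage can be reviewed with sacct, provided job accounting is enabled on the cluster. The job ID below is a placeholder.

 # Summarize elapsed time, peak memory, and final state of a completed job (12345 is a placeholder ID)
 sacct -j 12345 --format=JobID,JobName,Elapsed,MaxRSS,State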