Job submission test


Job Submission – Hello World Examples for NU HPC System

This section provides working examples of SLURM job submission scripts for serial, MPI, and GPU "Hello World" tasks. Each example can be directly copied and run on the NU HPC system.

1. Single-Core Serial Job Example

Step 1: Creating the Python Script

First, create a Python script called hello_serial.py that prints a "Hello World" message:

print("Hello World from Serial!")

Step 2: Creating the SLURM Batch Script

Next, create the SLURM batch script called submit_serial.sh to submit the serial job. This script specifies job parameters and runs the Python program:

#!/bin/bash
#SBATCH --job-name=hello_serial        # Job name
#SBATCH --ntasks=1                     # Use a single core
#SBATCH --time=0-00:05:00              # Time limit (5 minutes)
#SBATCH --output=serial_output_%j.log  # Standard output log
#SBATCH --error=serial_error_%j.log    # Standard error log

# Load Python module (adjust as necessary)
module load python/3.8.5

# Run the Python script
python3 hello_serial.py
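
Step 3: Submitting the Job

Submit the job with sbatch and, once it finishes, inspect the output log (%j in the script expands to the job ID):

sbatch submit_serial.sh          # prints "Submitted batch job <jobid>"
squeue -u $USER                  # check whether the job is pending or running
cat serial_output_<jobid>.log    # should contain: Hello World from Serial!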

2. MPI Job Example

Step 1: Creating the MPI Program

Create a simple MPI program in C called hello_mpi.c, in which each process prints "Hello World" along with its rank.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    printf("Hello World from process %d!\n", world_rank);
    MPI_Finalize();
    return 0;
}
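
MPI_Comm_rank returns each process's rank within MPI_COMM_WORLD. An optional extension (not required for this test) also queries the total number of processes with MPI_Comm_size, so each line reports both values:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);  /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);  /* total number of processes */
    printf("Hello World from process %d of %d!\n", world_rank, world_size);
    MPI_Finalize();
    return 0;
}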

Step 2: Creating the SLURM Batch Script

Now, create the SLURM batch script called submit_mpi.sh to submit the MPI job:

#!/bin/bash
#SBATCH --job-name=hello_mpi           # Job name
#SBATCH --nodes=2                      # Number of nodes
#SBATCH --ntasks=4                     # Total number of MPI tasks (processes)
#SBATCH --time=0-00:05:00              # Time limit (5 minutes)
#SBATCH --output=mpi_output_%j.log     # Standard output log
#SBATCH --error=mpi_error_%j.log       # Standard error log

# Load MPI module
module load openmpi/4.1.0

# Compile the MPI program
mpicc -o hello_mpi hello_mpi.c

# Run the MPI program
mpirun -np 4 ./hello_mpi
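
Step 3: Submitting the Job

Submit the job with sbatch:

sbatch submit_mpi.sh

After it completes, mpi_output_<jobid>.log should contain one line per MPI process (the ordering of the lines is not guaranteed):

Hello World from process 0!
Hello World from process 1!
Hello World from process 2!
Hello World from process 3!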

3. GPU Job Example

Step 1: Creating the CUDA Program

Create a simple CUDA program called hello_gpu.cu that prints "Hello World".

#include <stdio.h>

__global__ void helloFromGPU() {
    printf("Hello World from GPU!\n");
}

int main() {
    helloFromGPU<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
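
The <<<1, 1>>> launch configuration starts a single block containing a single thread. To see output from several GPU threads, a variant can launch more threads per block and print each thread's index:

#include <stdio.h>

__global__ void helloFromGPU() {
    /* threadIdx.x is this thread's index within its block */
    printf("Hello World from GPU thread %d!\n", threadIdx.x);
}

int main() {
    helloFromGPU<<<1, 4>>>();    // launch 1 block of 4 threads
    cudaDeviceSynchronize();     // wait for the kernel so its printf output is flushed
    return 0;
}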

Step 2: Creating the SLURM Batch Script

Next, create the SLURM batch script called submit_gpu.sh to submit the GPU job:

#!/bin/bash
#SBATCH --job-name=hello_gpu           # Job name
#SBATCH --output=gpu_output_%j.log     # Standard output log
#SBATCH --error=gpu_error_%j.log       # Standard error log
#SBATCH --nodes=1                      # Number of nodes
#SBATCH --ntasks=1                     # Number of tasks
#SBATCH --gpus=a100:1                  # Request 1 A100 GPU
#SBATCH --time=0-00:05:00              # Time limit (5 minutes)

# Load CUDA module
module load cuda/11.4.1

# Compile the CUDA program
nvcc -o hello_gpu hello_gpu.cu

# Run the CUDA program
./hello_gpu
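
Step 3: Submitting the Job

Submit the job with sbatch; after it completes, gpu_output_<jobid>.log should contain the line "Hello World from GPU!":

sbatch submit_gpu.sh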


Useful SLURM Commands

Below are some useful SLURM commands for job submission, monitoring, and management:

Command                      Description
sbatch <file>                Submit a job script
squeue                       View the current job queue
scancel <jobid>              Cancel a job by its job ID
sinfo                        Show information about available partitions
scontrol show job <jobid>    Display detailed information about a specific job
sacct                        View historical job accounting information
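
For example, a typical submit-and-monitor sequence combines these commands as follows (replace <jobid> with the ID reported by sbatch):

sbatch submit_mpi.sh             # prints "Submitted batch job <jobid>"
squeue -u $USER                  # list only your own jobs
scontrol show job <jobid>        # detailed state while the job is queued or running
sacct -j <jobid>                 # accounting record after the job finishes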

By following these steps, you can quickly test serial, MPI, and GPU jobs on the NU HPC system. Just copy the scripts, adjust module versions and resource requests as needed, and submit them with sbatch.